Introduction


Modularity 

The  entities  constructed  by  programming  are  extremely  complex. 
Accurate  construction  of  large  programs  would  be  impossible  without 
specific  techniques  for  controlling  this  complexity.  Most  such  techniques 
are  based  on  finding  ways  to  decompose  a problem  into  almost  independently 
solvable  subproblems,  allowing  a programmer  to  concentrate  on  one 
subproblem  at  a time,  ignoring  the  others.  When  the  subproblems  are 
solved,  the  programmer  must  be  able  to  combine  the  solutions  with  a minimum 
of  unanticipated  interactions.  To  the  extent  that  a decomposition  succeeds 
in  breaking  a programming  problem  into  manageable  pieces,  we  say  that  the 
resulting  program  is  modular;  each  part  of  the  solution  is  called  a 
module.  Well-designed  programming  languages  provide  features  which  support 
the  construction  of  modular  programs. 

One  decomposition  strategy  is  the  packaging  of  common  patterns  of 
the  use  of  a language.  For  example,  in  Algol  a for  loop  captures  a common 
pattern  of  if  and  goto  statements.  Packages  of  common  patterns  are  not 
necessarily  merely  abbreviations  to  save  typing.  While  a simple 
abbreviation  has  little  abstraction  power  because  a user  must  know  what  the 
abbreviation  expands  into,  a good  package  encapsulates  a higher  level 
concept  which  has  meaning  independent  of  its  implementation.  Once  a 
package  is  constructed  the  programmer  can  use  it  directly,  without  regard 
for  the  details  it  contains,  precisely  because  it  corresponds  to  a single 
notion  he  uses  in  dealing  with  the  programming  problem. 

A package  is  most  useful  if  its  behavior  is  independent  of  the 
context  of  its  use,  thus  reducing  possible  interference  with  other 
packages. Such a package is called referentially transparent.
Intuitively,  referential  transparency  requires  that  the  meanings  of  parts 
of  a program  be  apparent  and  not  change,  so  that  such  meanings  can  be 
reliably  depended  upon.  In  particular,  names  internal  to  one  module  should 
not  affect  or  be  affected  by  other  modules  — the  external  behavior  of  a 
module  should  be  independent  of  the  choice  of  names  for  its  local 
identifiers.

To  make  a modular  program,  it  is  often  necessary  to  think  of  a 
computational  process  as  having  state.  In  such  cases,  if  the  state  can  be 
naturally  divided  into  independent  parts,  an  important  decomposition  may  be 
the  division  of  the  program  into  pieces  which  separately  deal  with  the 
parts  of  the  state. 

We  will  discuss  various  stylistic  techniques  for  achieving 
modularity.  One  would  expect  these  techniques  to  complement  each  other. 
We  will  instead  discover  that  they  can  come  into  conflict.  Pushing  one  to 
an  extreme  in  a language  can  seriously  compromise  others. 

LISP-like Languages

Of  the  hundreds  or  thousands  of  computer  languages  which  have  been 
invented,  there  is  one  particular  family  of  languages  whose  common  ancestor 
was the original LISP, developed by McCarthy and others in the late 1950's.
[LISP  History]  These  languages  are  generally  characterized  by  a simple, 
fully  parenthesized  ("Cambridge  Polish")  syntax;  the  ability  to  manipulate 
general,  linked-list  data  structures;  a standard  representation  for 
programs  of  the  language  in  terms  of  these  structures;  and  an  interactive 
programming  system  based  on  an  interpreter  for  the  standard  representation. 
Examples  of  such  languages  are  LISP  1.5  [LISP  1.5M],  MacLISP  [Moon], 
InterLISP [Teitelman], CONNIVER [McDermott and Sussman], QA4 [Rulifson],
PLASMA  [Smith  and  Hewitt]  [Hewitt  and  Smith],  and  SCHEME  [SCHEME]  [Revised 
Report].  We  will  call  this  family  the  LISP-like  languages. 

The  various  members  of  this  family  differ  in  some  interesting  and 
often  subtle  ways.  These  differences  have  a profound  impact  on  the  styles 
of  programming  each  may  encourage  or  support.  We  will  explore  some  of 
these  differences  by  examining  a series  of  small  ("toy")  evaluators  which 
exhibit  these  differences  without  the  clutter  of  "extra  features"  provided 
in  real,  production  versions  of  LISP-like  language  systems. 

The  series  of  evaluators  to  be  considered  partially  constitute  a 
reconstruction  of  what  we  believe  to  be  the  paths  along  which  the  family 
evolved.  These  paths  can  be  explained  after  the  fact  by  viewing  the 
historical  changes  to  the  language  as  being  guided  by  the  requirements  of 
various  aspects  of  modularity. 


Structure  of  the  Paper 

Our  discussion  is  divided  into  several  parts,  which  form  a linear 
progression.  In  addition,  there  are  numerous  large  digressions  which 
explore  interesting  side  developments.  These  digressions  are  placed  at  the 
end  as  notes,  cross-referenced  to  and  from  the  text. 

We  exhibit  a large  number  of  LISP  interpreters  whose  code  differs 
from  one  to  another  in  small  ways  (though  their  behavior  differs  greatly!). 
In  order  to  avoid  writing  identical  pieces  of  code  over  and  over,  each 
figure  exhibits  only  routines  which  differ,  and  also  contains  cross- 
references  to  preceding  figures  from  which  missing  routines  for  that  figure 
are  to  be  drawn. 

Part  Zero  introduces  the  restricted  dialect  of  the  LISP  language  in 
which most of our examples are written. It also discusses the basic
structure  of  an  interpreter,  and  exhibits  a meta-circular  interpreter  for 
the  language. 

Part  One  introduces  procedural  data  as  an  abstraction  mechanism, 
and considers its impact on variable scoping disciplines in the language.
We  are  forced  through  a series  of  such  disciplines  as  unexpected 
interactions  are  uncovered  and  fixed.  Interpreters  are  exhibited  for 
dynamic  scoping  and  lexical  scoping. 

Part  Two  considers  the  problems  associated  with  the  decomposition 
of state. Side effects are introduced as a mechanism for effecting such
decompositions.  We  find  that  the  notion  of  side  effect  is  inextricably 
wound  up  with  the  notion  of  identity.  Dynamic  scoping  is  retrospectively 
viewed  as  a restricted  kind  of  side  effect. 

With  this  we  summarize  and  conclude  with  many  tantalizing  questions 
yet  unanswered. 

In  Part  Three  (in  a separate  paper)  we  will  find  that  the 
introduction  of  side  effects  forces  the  issue  of  the  order  of  evaluation  of 
expressions.  We  will  contrast  call-by-name  and  its  variants  with  call-by- 
value, and  discuss  how  these  control  disciplines  arise  as  a consequence  of 
different  models  of  packaging.  In  particular,  call-by-name  arises 
naturally from the syntactic nature of the Algol 60 copy rule. As before,
many  little  interpreters  for  these  disciplines  will  be  exhibited. 

In  Part  Four  we  will  be  led  to  generalize  the  notion  of  a syntactic 
package.  We  will  discuss  meta-procedures,  which  deal  with  the 
representations  of  procedures.  The  distinction  between  a procedure  and  its 
representation  will  be  more  carefully  considered.  Macro  processors, 
algebraic  simplifiers,  and  compilers  will  be  considered  as  meta-procedures. 
Various  interpreters,  compilers,  and  simplifiers  will  be  exhibited. 

Recursion Equations


Contrary  to  popular  belief,  LISP  was  not  originally  derived  from 
Church's λ-calculus [Church] [LISP History]. The earliest LISP did not
have  a well-defined  notion  of  free  variables  or  procedural  objects.  Early 
LISP  programs  were  similar  to  recursion  equations,  defining  functions  on 
symbolic  expressions  ("S-expressions").  They  differed  from  the  equations 
of  pure  recursive  function  theory  [Kleene]  by  introducing  the  conditional 
expression  construction  (often  called  the  "McCarthy  conditional"),  to  avoid 
"pattern-directed  invocation".  That  is,  in  recursive  function  theory  one 
would  define  the  factorial  function  by  the  following  two  equations: 

    factorial(0) = 1

    factorial(successor(x)) = successor(x) * factorial(x)

In early LISP, however, one would have written:

    factorial[x] = [x=0 → 1; T → x*factorial[x-1]]

where "[a → b; T → c]" essentially means "if a then b else c". The
recursive  function  theory  version  depends  on  selecting  which  of  two 
equations  to  use  by  matching  the  argument  to  the  left-hand  sides  (such  a 
discipline  is  actually  used  in  the  PROLOG  language  [Warren]);  the  early 
LISP  version  represents  this  decision  as  a conditional  expression. 
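
The contrast between the two styles can be sketched in a modern language. The Python fragment below is only an illustration and is not part of the original LISP; the names EQUATIONS, factorial_equations, and factorial_conditional are hypothetical. Each equation is modelled, as an assumption, by a pair consisting of a test on the argument and a right-hand side.

```python
# Pattern-directed style: a list of "equations", each a pair
# (test on the argument, right-hand side). The first matching one is used.
EQUATIONS = [
    # factorial(0) = 1
    (lambda n: n == 0, lambda n: 1),
    # factorial(successor(x)) = successor(x) * factorial(x)
    (lambda n: n > 0, lambda n: n * factorial_equations(n - 1)),
]


def factorial_equations(n):
    """Select an equation by matching the argument against each left-hand side."""
    for matches, right_hand_side in EQUATIONS:
        if matches(n):
            return right_hand_side(n)
    raise ValueError("no equation matches %r" % (n,))


def factorial_conditional(x):
    """Early-LISP style: a single equation whose body is a conditional expression."""
    return 1 if x == 0 else x * factorial_conditional(x - 1)
```

Both functions compute the same values; they differ only in where the case analysis lives, in the selection of an equation or inside the body of one equation.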

The  theory  of  recursion  equations  deals  with  functions  over  the 
natural  numbers.  In  LISP,  however,  one  is  interested  in  being  able  to 
manipulate  algebraic  expressions,  programs,  and  other  symbolic  expressions 
as data structures. While such expressions can be encoded as numbers
(using the technique of "arithmetization" developed by Kurt Gödel), such an
encoding is not very convenient. Instead, a new kind of data called
"S-expressions" (for "symbolic expressions") is introduced specifically to
provide convenient encodings. S-expressions can be defined by a set of
formal inductive axioms analogous to the Peano postulates used to define
natural numbers. Here we will give only an informal and incomplete
definition of S-expressions; for a more complete description, see {Note
S-expression Postulates and Notation}.

For  our  purposes  we  will  need  only  the  special  cases  of  S- 
expressions called atoms and lists. An atom is an "indivisible" data
object,  which  we  denote  by  writing  a string  of  letters  and  digits;  if  only 
digits  are  used,  then  the  atom  is  considered  to  be  a number.  Many  special 
characters  such  as  "-"  and  "+"  are  considered  to  be  letters;  we  will  see 
below  that  it  is  not  necessary  to  specially  reserve  them  for  use  as 
operator  symbols.  A list  is  a (possibly  empty)  sequence  of  S-expressions, 
notated  by  writing  the  S-expressions  in  order,  between  a set  of  parentheses 
and separated by spaces. A list of the atoms "FOO", "43", and "BAR" would
be  written  "(FOO  43  BAR)".  Notice  that  the  definition  of  a list  is 
recursive.  For  example, 

(DEFINE  (SECOND  X)  (CAR  (CDR  X))) 

is  a list  of  three  things:  the  atomic  symbol  define,  a list  of  the  two 
atomic  symbols  second  and  x,  and  another  list  of  two  other  things. 

We  can  use  S-expressions  to  represent  algebraic  expressions  by 
using "Cambridge Polish" notation, essentially a parenthesized version of
prefix  Polish  notation.  Numeric  constants  are  encoded  as  numeric  atoms; 
variables  are  encoded  as  non-numeric  atoms  (which  henceforth  we  will  call 
atomic  symbols);  and  procedure  invocations  are  encoded  as  lists,  where  the 
first  element  of  the  list  represents  the  procedure  and  the  rest  represent 
the arguments. For example, the algebraic expression "a*b+c*d" can be
represented  as  "(+  (*  a b)  (*  c d))".  Notice  that  LISP  does  not  need  the 
usual  precedence  rules  concerning  whether  multiplication  or  addition  is 
performed  first;  the  parentheses  explicitly  define  the  order.  Also,  all 
procedure  invocations  have  a uniform  syntax,  no  matter  how  many  arguments 
are  involved.  Infix,  superscript,  and  subscript  notations  are  not  used; 
thus the expression "J_p(x^2+1)" would be written "(J p (+ (↑ x 2) 1))".
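
As a rough illustration of this encoding, the sketch below models S-expressions in Python: nested Python lists stand for LISP lists, strings for atomic symbols, and numbers for numeric atoms. This representation, the OPERATORS table, and the function name evaluate are assumptions made for the example, not something the text prescribes.

```python
from functools import reduce
import operator

# "+" and "*" are ordinary symbols that happen to name procedures;
# nothing about the notation reserves them as operators.
OPERATORS = {"+": operator.add, "*": operator.mul}


def evaluate(expression, variables):
    """Evaluate an algebraic expression written in Cambridge Polish form."""
    if isinstance(expression, (int, float)):   # numeric atom: a constant
        return expression
    if isinstance(expression, str):            # atomic symbol: a variable
        return variables[expression]
    # A list: the first element names the procedure, the rest are arguments.
    operator_symbol, *arguments = expression
    values = [evaluate(argument, variables) for argument in arguments]
    return reduce(OPERATORS[operator_symbol], values)


# (+ (* a b) (* c d))
expression = ["+", ["*", "a", "b"], ["*", "c", "d"]]
result = evaluate(expression, {"a": 2, "b": 3, "c": 4, "d": 5})  # 2*3 + 4*5 = 26
```

No precedence rules appear anywhere in the sketch: the nesting of the lists alone determines which operation is applied first, and the same code handles any number of arguments.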

To  encode  a conditional  expression 

    [p1 → e1; p2 → e2; ...; pn → en]

(which means to evaluate the predicates pi in order until a true one is
found, at which point the value of the corresponding ei is taken to be the
value of the conditional) we write the S-expression

    (COND (p1 e1) (p2 e2) ... (pn en))

We  can  now  encode  sets  of  LISP  recursion  equations  as  S- 
expressions.  For  the  equation 

    factorial[x] = [x=0 → 1; T → x*factorial[x-1]]

we write the S-expression

    (DEFINE (FACTORIAL X)
            (COND ((= X 0) 1)
                  (T (* X (FACTORIAL (- X 1))))))

(We could also have written

    (DEFINE (FACTORIAL X) (COND ((=
    X 0) 1) (T (* X (FACTORIAL (- X
    1))))))

but we conventionally lay out S-expressions so that they are easy to read.)

We now have a complete encoding for algebraic expressions and LISP
recursion equations in the form of S-expressions. Suppose that we now want
to  write  a LISP  program  which  will  take  such  an  S-expression  and  perform 
some  useful  operation  on  it,  such  as  determining  the  value  of  an  algebraic 
expression.  We  need  some  procedures  for  distinguishing,  decomposing,  and 
constructing  S-expressions. 

The  predicate  atom,  when  applied  to  an  S-expression,  produces  true 
when  given  an  atom  and  false  otherwise.  The  empty  list  is  considered  to  be 
an  atom.  The  predicate  null  is  true  of  only  the  empty  list;  its  argument 
need  not  be  a list,  but  may  be  any  S-expression.  The  predicate  numberp  is 
true of numbers and false of atomic symbols and lists. The predicate eq,
when applied to two atomic symbols, is true if the two atomic symbols are
identical. It is false when applied to an atomic symbol and any other
S-expression. (We have not defined eq on two lists yet; this will not
become important, or even meaningful, until we discuss side effects.)

The  decomposition  operators  for  lists  are  traditionally  called  car 
and  cdr  for  historical  reasons.  [LISP  History]  car  extracts  the  first 
element  of  a list,  while  cdr  produces  a list  containing  all  elements  but 
the first. Because compositions of car and cdr are commonly used in LISP,
an abbreviation is provided: all the C's and R's in the middle can be
squeezed out. For example, "(cdr (cdr (car (cdr x))))" can be written as
"(cddadr x)".

The  construction  operator  cons,  given  an  S-expression  and  a list, 
produces  a new  list  whose  car  is  the  S-expression  and  whose  cdr  is  the 
list.  The  operator  list  can  take  any  number  of  arguments  (a  special 
feature),  and  produces  a list  of  its  arguments. 
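
The primitives just described can be approximated in Python. This is a sketch under stated assumptions: Python lists stand for LISP lists, strings for atomic symbols, and all function names (including lisp_list, chosen to avoid shadowing Python's built-in list) are hypothetical. Python lists are copied by slicing, which is not how a real LISP shares list structure, so only the values agree, not the storage behavior.

```python
def atom(x):
    """True of symbols and numbers, and (in this dialect) of the empty list."""
    return not isinstance(x, list) or len(x) == 0


def null(x):
    """True only of the empty list; the argument may be any S-expression."""
    return isinstance(x, list) and len(x) == 0


def numberp(x):
    """True of numbers, false of atomic symbols and lists."""
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def eq(x, y):
    """Defined here only for atomic symbols; false for anything else."""
    return isinstance(x, str) and isinstance(y, str) and x == y


def car(x):
    """First element of a non-empty list."""
    return x[0]


def cdr(x):
    """A list of all elements but the first."""
    return x[1:]


def cons(head, tail):
    """A new list whose car is head and whose cdr is tail."""
    return [head] + tail


def lisp_list(*items):
    """A list of the arguments, however many there are."""
    return list(items)


def cddadr(x):
    """The abbreviation for (cdr (cdr (car (cdr x))))."""
    return cdr(cdr(car(cdr(x))))
```

For example, cons("FOO", [43, "BAR"]) produces the list written "(FOO 43 BAR)" in the text.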

We  can  now  write  some  interesting  programs  in  LISP  to  deal  with  S- 
expressions.  For  example,  we  can  write  a predicate  equal,  which  determines 
whether two S-expressions have the same car-cdr structure:

    (DEFINE (EQUAL X Y)
            (COND ((NUMBERP X)
                   (COND ((NUMBERP Y) (= X Y))
                         (T NIL)))
                  ((ATOM X) (EQ X Y))
                  ((ATOM Y) NIL)
                  ((EQUAL (CAR X) (CAR Y))
                   (EQUAL (CDR X) (CDR Y)))))

Here we have used the standard names T and nil to represent true and false.
(Traditionally  nil  is  also  considered  to  be  the  empty  list,  but  we  will 
avoid  this  here,  writing  "()"  for  the  empty  list.) 
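
The same structural comparison can be sketched in Python, again assuming Python lists for LISP lists and strings for atomic symbols; the helper is_atom is hypothetical. One deliberate difference from the listing above: where the final clause of the LISP version fails and no clause remains, this sketch simply returns False.

```python
def is_atom(x):
    """Symbols, numbers, and the empty list count as atoms."""
    return not isinstance(x, list) or len(x) == 0


def equal(x, y):
    """True when x and y have the same car-cdr structure and the same atoms."""
    if isinstance(x, (int, float)):
        return isinstance(y, (int, float)) and x == y
    if is_atom(x):
        return is_atom(y) and x == y
    if is_atom(y):
        return False
    # Both are non-empty lists: compare first elements, then the rests.
    return equal(x[0], y[0]) and equal(x[1:], y[1:])
```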

Because LISP programs are represented as LISP data structures
(S-expressions), there is a difficulty with representing constants. For
example, suppose we want to determine whether or not the value of the
variable x is the atomic symbol "foo". We might try writing:

    (EQ X FOO)

This doesn't work. The occurrence of "foo" does not refer to the atomic
symbol  foo  as  a constant;  it  is  treated  as  a variable,  just  as  "x"  is. 

The  essential  problem  is  that  we  want  to  be  able  to  write  any  S- 
expression  as  a constant  in  a program,  but  some  S-expressions  must  be  used 
to  represent  other  things,  such  as  variables  and  procedure  invocations.  To 
solve this problem we invent a new notation: (quote x) in a program
represents the constant S-expression x. {Note QUOTE Mapping} Thus we can
write our test as "(eq x (quote foo))". Similarly,

    (EQUAL X (LIST Y Z))

constructs a list from the values of y and z, and compares the result to
the value of x, while

(EQUAL  X (QUOTE  (LIST  Y Z))) 

compares  the  value  of  x to  the  constant  S-expression  "(list  y z)".  Because 
the  quote  construction  is  used  so  frequently  in  LISP,  we  use  an  abbreviated 
notation:  "'foo"  is  equivalent  to  "(quote  foo)".  This  is  only  a notational 
convenience;  the  two  notations  denote  the  same  S-expression.  (S- 
expressions  are  not  character  strings,  but  data  objects  with  a certain 
structure.  We  use  character  strings  to  notate  S-expressions  on  paper,  but 
we  can  use  other  notations  as  well,  such  as  little  boxes  and  arrows.  We 
can and do allow several different character strings to denote the same
S-expression.)
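
The difference between a symbol used as a variable and a symbol used as a constant can be seen in a very small evaluator. The Python sketch below is an assumption-laden illustration: it handles only the forms quote, list, and eq, represents the environment as a Python dictionary, and is not the interpreter developed later in the paper.

```python
def evaluate(expression, environment):
    """Evaluate a tiny subset: variables, numbers, quote, list, and eq."""
    if isinstance(expression, str):
        # A bare symbol in a program is a variable reference.
        return environment[expression]
    if isinstance(expression, (int, float)):
        return expression
    head = expression[0]
    if head == "quote":
        # (quote x) yields the S-expression x itself, unevaluated.
        return expression[1]
    if head == "list":
        return [evaluate(item, environment) for item in expression[1:]]
    if head == "eq":
        left = evaluate(expression[1], environment)
        right = evaluate(expression[2], environment)
        return left == right
    raise ValueError("unknown form: %r" % (head,))


environment = {"x": "foo", "y": 1, "z": 2}

# (eq x (quote foo)) compares the value of x with the constant symbol foo.
quoted_test = evaluate(["eq", "x", ["quote", "foo"]], environment)

# (list y z) builds a list from values; (quote (list y z)) is a constant.
built = evaluate(["list", "y", "z"], environment)
constant = evaluate(["quote", ["list", "y", "z"]], environment)
```

Evaluating ["eq", "x", "foo"] in the same environment raises KeyError, because "foo" is looked up as a variable that has no value.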


An  Interpreter  for  LISP  Recursion  Equations 

We  now  have  enough  machinery  to  begin  our  examination  of  the 
genetic  history  of  LISP.  We  first  present  a complete  interpreter  for  LISP 
recursion  equations.  The  language  interpreted  is  a dialect  of  LISP  which 
allows  no  free  variables  except  for  names  of  primitive  or  defined 
procedures,  and  no  definitions  of  procedures  within  other  procedures. 

The  driver  loop  reads  in  definitions  of  procedures  of  the  form: 

    (DEFINE (F A B C ...) <expression in A B C ... and F G H ...>)

and saves them. It can also read in requests to apply some defined
procedure to some arguments (or, more generally, to evaluate any
expression), in which case it prints the resulting value. An expression
may consist of variable references, constants (numbers and quoted
S-expressions), procedure calls, and conditional expressions (cond). The
defined procedures may refer to each other and to initially supplied
primitive procedures (such as car, cons, etc.). Definitions may contain
"forward references", as long as all necessary definitions are present at
the time of a request for a computation. The interpreter itself is
presented here as a set of such definitions, and so is meta-circular.

The  language  is  intended  to  be  evaluated  in  applicative  order; 
that is, all arguments to a procedure are fully evaluated before an attempt
is made to apply the procedure to the arguments. (It is necessary to state
this explicitly here, as it is not inherent in the form of the
meta-circular definition. See [Reynolds] for an explication of this problem.)

The  driver  loop  (see  Figure  1)  is  conceptually  started  by  a request 
to  invoke  driver  with  no  arguments.  Its  task  is  to  first  print  the  message 
"LITHP  ITH  LITHTENING"  (a  tradition  of  sorts)  and  then  invoke  driver-loop. 
The expression <THE-PRIMITIVE-PROCEDURES> is intended to represent a constant
list  structure,  containing  definitions  of  primitive  procedures,  to  be 
supplied  to  driver-loop. 


(DEFINE (DRIVER)
        (DRIVER-LOOP <THE-PRIMITIVE-PROCEDURES> (PRINT '|LITHP ITH LITHTENING|)))

(DEFINE (DRIVER-LOOP PROCEDURES HUNOZ)
        (DRIVER-LOOP-1 PROCEDURES (READ)))

(DEFINE (DRIVER-LOOP-1 PROCEDURES FORM)
        (COND ((ATOM FORM)
               (DRIVER-LOOP PROCEDURES (PRINT (EVAL FORM '() PROCEDURES))))
              ((EQ (CAR FORM) 'DEFINE)
               (DRIVER-LOOP (BIND (LIST (CAADR FORM))
                                  (LIST (LIST (CDADR FORM) (CADDR FORM)))
                                  PROCEDURES)
                            (PRINT (CAADR FORM))))
              (T (DRIVER-LOOP PROCEDURES (PRINT (EVAL FORM '() PROCEDURES))))))

Figure  1 

Top  Level  Driver  Loop  for  a Recursion  Equations  Interpreter 


driver-loop reads an S-expression from the input stream and passes
it, along with the current procedure definitions, to driver-loop-1. This
procedure in turn determines whether the input S-expression is a
definition. If it is, then it uses bind (described below) to produce an
augmented set of procedure definitions, prints the name of the defined
procedure, and calls driver-loop to repeat the process. The augmented set of
procedures is passed to driver-loop, and so the variable procedures always
contains all the accumulated definitions ever read. If the input
S-expression is not a definition, then it is given to the evaluator eval,
whose purpose is to determine the values of expressions. {Note Value
Quibble} The set of currently defined procedures is also passed to eval.

The  process  carried  on  by  the  driver  loop  is  often  called  the  "top 
level";  all  user  programs  and  requests  are  run  "under"  it.  The  growing 
set  of  procedure  definitions  is  called  the  "top-level  environment";  this 
environment  changes  in  the  course  of  the  user  interaction,  and  contains  the 
state of the machine as perceived by the user. It is within this
environment that user programs are executed.
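
The way the driver loop threads the growing set of definitions through its own arguments, rather than updating a global table, can be sketched in Python. Everything here is an assumption made for illustration: input is a list of already-parsed forms instead of a call to read, printed values are collected in a tuple instead of being printed, the table is a flat list of pairs rather than the bucket structure of bind, and the evaluator is supplied by the caller.

```python
def driver_loop(procedures, forms, evaluate, printed=()):
    """Process forms in order, passing the definitions along as an argument.

    Returns the final table of definitions and the list of printed values.
    """
    if not forms:
        return procedures, list(printed)
    form, rest = forms[0], forms[1:]
    if isinstance(form, list) and form and form[0] == "define":
        # (define (name param ...) body)
        (name, *parameters), body = form[1], form[2]
        # The augmented table is a new list; the old one is left untouched.
        augmented = [(name, (parameters, body))] + procedures
        return driver_loop(augmented, rest, evaluate, printed + (name,))
    value = evaluate(form, procedures)
    return driver_loop(procedures, rest, evaluate, printed + (value,))
```

Because each call receives the table explicitly, the "state" of the top level exists only in the arguments of the current call, which is the property the text attributes to the top-level environment.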

(DEFINE (EVAL EXP ENV PROCEDURES)
        (COND ((ATOM EXP)
               (COND ((EQ EXP 'NIL) 'NIL)
                     ((EQ EXP 'T) 'T)
                     ((NUMBERP EXP) EXP)
                     (T (VALUE EXP ENV))))
              ((EQ (CAR EXP) 'QUOTE)
               (CADR EXP))
              ((EQ (CAR EXP) 'COND)
               (EVCOND (CDR EXP) ENV PROCEDURES))
              (T (APPLY (VALUE (CAR EXP) PROCEDURES)
                        (EVLIS (CDR EXP) ENV PROCEDURES)
                        PROCEDURES))))

(DEFINE (APPLY FUN ARGS PROCEDURES)
        (COND ((PRIMOP FUN) (PRIMOP-APPLY FUN ARGS))
              (T (EVAL (CADR FUN)
                       (BIND (CAR FUN) ARGS '())
                       PROCEDURES))))

(DEFINE (EVCOND CLAUSES ENV PROCEDURES)
        (COND ((NULL CLAUSES) (ERROR))
              ((EVAL (CAAR CLAUSES) ENV PROCEDURES)
               (EVAL (CADAR CLAUSES) ENV PROCEDURES))
              (T (EVCOND (CDR CLAUSES) ENV PROCEDURES))))

(DEFINE (EVLIS ARGLIST ENV PROCEDURES)
        (COND ((NULL ARGLIST) '())
              (T (CONS (EVAL (CAR ARGLIST) ENV PROCEDURES)
                       (EVLIS (CDR ARGLIST) ENV PROCEDURES)))))

Figure  2 

Evaluator  for  a Recursion  Equations  Interpreter 


The  evaluator  proper  (see  Figure  2)  is  divided  into  two  conceptual 
components: eval and apply. eval classifies expressions and directs their
evaluation.  Simple  expressions  (such  as  constants  and  variables)  can  be 
evaluated  directly.  For  the  complex  case  of  procedure  invocations 
(technically  called  "combinations"),  eval  looks  up  the  procedure 
definition,  recursively  evaluates  the  arguments  (using  evlis),  and  then 
calls  apply.  apply  classifies  procedures  and  directs  their  application. 
Simple  procedures  (primitive  operators)  are  applied  directly.  For  the 
complex  case  of  user-defined  procedures,  apply  uses  bind  to  build  an 
environment,  a kind  of  symbol  table,  associating  the  formal  parameters  from 
the  procedure  definition  with  the  actual  argument  values  provided  by  eval. 
The  body  of  the  procedure  definition  is  then  passed  to  eval,  along  with  the 
environment  just  constructed,  which  is  used  to  determine  the  values  of 
variables occurring in the body.

In  more  detail,  eval  is  a case  analysis  on  the  structure  of  the  S- 
expression  exp.  If  it  is  an  atom,  there  are  several  subcases.  The  special 
atoms T and nil are defined to evaluate to T and nil (this is strictly for
convenience, because they are used as truth values). Similarly, for
convenience numeric atoms evaluate to themselves. (These cases could be
eliminated by requiring the user to write lots of quote forms: 'T, 'nil,
'43, etc. This would have been quite inconvenient in early LISP, before
the "'" notation had been introduced; one would have had to write (quote
43), etc.) Atomic symbols, however, encode variables; the value
associated  with  that  symbol  is  extracted  from  the  environment  env  using  the 
function  value  (see  below). 

If the expression to be evaluated is not atomic, then it may be a
quote form, a cond form, or a combination. For a quote form, eval extracts
the S-expression constant using cadr. Conditionals are handled by evcond,
which calls eval on a predicate expression; if the predicate is true,
evcond evaluates the corresponding result expression (by calling eval, of
course); if the predicate is false, evcond calls itself to test the
predicate of the next clause of the cond body. For combinations, the
procedure is obtained, the arguments evaluated (using evlis), and apply
called as described earlier. Notice that value is used to get the
procedure definition from the set procedures; we can do this because, as an
engineering trick, we arrange for env and procedures to have the same
structure, because they are both symbol tables.

evlis  is  a simple  recursive  function  which  calls  eval  on  successive 
arguments  in  arglist  and  produces  a list  of  the  values  in  order. 

apply distinguishes two kinds of procedures: primitive and
user-defined. For now we avoid describing the precise implementation of
primitive procedures by assuming the existence of a predicate primop which
is true only of primitive procedures, and a function primop-apply which deals
with the application of such primitive procedures. (See {Note Primitive
Operators} for the details of a possible implementation of primop and
primop-apply.) We consider primitive procedures to be a kind of atomic
S-expression other than numbers and atomic symbols; we define no particular
written notation for them here. However, primitive procedures are not to
be confused with the atomic symbols used as their names. The result of
(value 'car procedures) is not the atomic symbol car, but rather some bizarre
object which is meaningful only to primop-apply.

User-defined procedures are represented here as lists. These lists
are constructed by driver-loop-1. The car of the list is the list of formal
parameters, and the cadr is the body of the definition.
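
The division of labor between eval and apply described above can be sketched compactly in Python. This is a loose transliteration under several assumptions, not the paper's code: Python lists stand for S-expressions, Python callables stand in for primitive procedures (replacing primop and primop-apply), user-defined procedures are (parameters, body) pairs, and the names PRIMITIVES, PRIMITIVE_TABLE, and apply_procedure are hypothetical.

```python
import operator

PRIMITIVES = {
    "car": lambda x: x[0], "cdr": lambda x: x[1:],
    "cons": lambda a, d: [a] + d, "null": lambda x: x == [],
    "=": operator.eq, "+": operator.add,
    "-": operator.sub, "*": operator.mul,
}


def bind(names, values, table):
    """Add one bucket of name/value associations to the front of a table."""
    if len(names) != len(values):
        raise TypeError("wrong number of arguments")
    return [(names, values)] + table


def value(name, table):
    """Look a name up, leftmost bucket first."""
    for names, values in table:
        if name in names:
            return values[names.index(name)]
    raise NameError(name)


def evaluate(exp, env, procedures):
    """Classify the expression and direct its evaluation."""
    if isinstance(exp, str):
        if exp == "t":
            return True
        if exp == "nil":
            return False
        return value(exp, env)
    if not isinstance(exp, list):        # numbers evaluate to themselves
        return exp
    if exp[0] == "quote":
        return exp[1]
    if exp[0] == "cond":
        return evcond(exp[1:], env, procedures)
    # A combination: arguments are evaluated first (applicative order).
    args = [evaluate(arg, env, procedures) for arg in exp[1:]]
    return apply_procedure(value(exp[0], procedures), args, procedures)


def apply_procedure(fun, args, procedures):
    """Classify the procedure and direct its application."""
    if callable(fun):                    # stands in for a primitive operator
        return fun(*args)
    parameters, body = fun
    # The body sees only its own parameters: the new table starts empty.
    return evaluate(body, bind(parameters, args, []), procedures)


def evcond(clauses, env, procedures):
    for predicate, result in clauses:
        if evaluate(predicate, env, procedures) is not False:
            return evaluate(result, env, procedures)
    raise RuntimeError("no cond clause was true")


PRIMITIVE_TABLE = bind(list(PRIMITIVES), list(PRIMITIVES.values()), [])
FACTORIAL = ["cond", [["=", "x", 0], 1],
                     ["t", ["*", "x", ["factorial", ["-", "x", 1]]]]]
procedures = bind(["factorial"], [(["x"], FACTORIAL)], PRIMITIVE_TABLE)
answer = evaluate(["factorial", 5], [], procedures)   # 120
```

Note that procedure names are looked up in procedures while variables are looked up in env, and that apply_procedure builds the environment for a body from an empty table, so a body cannot refer to any variable other than its own parameters.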

(DEFINE (BIND VARS ARGS ENV)
        (COND ((= (LENGTH VARS) (LENGTH ARGS))
               (CONS (CONS VARS ARGS) ENV))
              (T (ERROR))))

(DEFINE (VALUE NAME ENV)
        (VALUE1 NAME (LOOKUP NAME ENV)))

(DEFINE (VALUE1 NAME SLOT)
        (COND ((EQ SLOT '&UNBOUND) (ERROR))
              (T (CAR SLOT))))

(DEFINE (LOOKUP NAME ENV)
        (COND ((NULL ENV) '&UNBOUND)
              (T (LOOKUP1 NAME (CAAR ENV) (CDAR ENV) ENV))))

(DEFINE (LOOKUP1 NAME VARS VALS ENV)
        (COND ((NULL VARS) (LOOKUP NAME (CDR ENV)))
              ((EQ NAME (CAR VARS)) VALS)
              (T (LOOKUP1 NAME (CDR VARS) (CDR VALS) ENV))))

Figure  3 

Utility  Routines  for  Maintaining  Environments 


The  interpreter  uses  several  utility  procedures  for  maintaining 
symbol  tables  (see  Figure  3).  A symbol  table  is  represented  as  a list  of 
buckets;  each  bucket  is  a list  whose  car  is  a list  of  names  and  whose  cdr 
is a list of corresponding values. {Note This ain't A-lists} If a variable
name  occurs  in  more  than  one  bucket,  the  leftmost  such  bucket  has  priority; 
in  this  way  new  symbol  definitions  added  to  the  front  of  the  list  can 
supersede  old  ones. 

bind  takes  a list  of  names,  a list  of  values,  and  a symbol  table, 
and  produces  a new  symbol  table  which  is  the  old  one  augmented  by  an  extra 
bucket  containing  the  new  set  of  associations.  (It  also  performs  a useful 
error  check  — length  returns  the  length  of  a list.) 

value is essentially an interface to lookup. We define it because
later, in Part Three, we will want to use different versions of value1
without changing the underlying algorithm in lookup. The check for &unbound
catches incorrect references to undefined variables.

lookup  takes  a name  and  a symbol  table,  and  returns  that  portion  of 
a bucket  whose  car  is  the  associated  value.  (This  definition  will  be  more 
useful  later  than  one  in  which  the  value  itself  is  returned.) 
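
The bucket structure and the decision to have lookup return a slot rather than a value can be sketched as follows. The Python representation is an assumption: a slot is modelled as a (values, index) pair, since Python has no direct analogue of the tail of a LISP list, and the sentinel UNBOUND plays the role of the unbound marker.

```python
UNBOUND = object()   # sentinel returned when no bucket mentions the name


def bind(names, values, table):
    """Return a new table: the old one plus one bucket at the front."""
    if len(names) != len(values):
        raise ValueError("names and values differ in length")
    return [(list(names), list(values))] + table


def lookup(name, table):
    """Return the slot holding the value of name, or UNBOUND.

    Buckets are searched from the front, so a newer binding supersedes
    an older one of the same name.
    """
    for names, values in table:
        if name in names:
            return (values, names.index(name))
    return UNBOUND


def value(name, table):
    """Interface to lookup: extract the value from the slot."""
    slot = lookup(name, table)
    if slot is UNBOUND:
        raise NameError("unbound variable: %s" % name)
    values, index = slot
    return values[index]


outer = bind(["x", "y"], [1, 2], [])
inner = bind(["x"], [10], outer)
```

Here value("x", inner) is 10 while value("x", outer) is still 1, because bind never modifies the table it is given.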

Note carefully the use of the variable procedures in the
interpreter. When driver-loop-1 calls eval it passes the current list of
defined procedures (both primitive and user-defined). driver-loop-1 is the
only routine which augments the value of procedures, and this value is only
used in eval, when it is passed to value. However, all of the routines
apply, evcond, and evlis have to know about procedures, and dutifully pass it
along so that it may be eventually used by eval. The set of definitions
must be passed along because there is no provision for free variables or
side effects; there is no way to have "memory" or "state" other than in
passed variables. The absence of free variables effectively causes our
language to be referentially transparent. However, we sense a disturbing
lack of modularity in the use of procedures (and, to a lesser extent, in the
use of env; look at evcond and evlis). We will return to this point later.

Steele and Sussman, The Art of the Interpreter

Our  recursion  equations  language  has  no  special  iteration  or 
looping  constructs,  such  as  the  Algol  for  statement  or  the  FORTRAN  DO  loop. 
All  loops  are  constructed  by  arranging  for  recursive  procedures  to  call 
themselves  or  each  other.  For  example,  evcond  (see  Figure  2)  iterates  over 
the clauses of a cond by calling itself on successive "tails" of the list
of clauses. Now such recursive calls may strike the reader familiar with
other languages (such as Algol, FORTRAN, PL/I, etc.) on an intuitive level
as  being  rather  inefficient  for  implementing  real  programs.  Even  granted 
that  calls  might  be  made  fast,  they  would  seem  to  consume  space  in  the  form 
of  return  addresses  and  other  control  information.  Examination  of  the 
recursion  equations  evaluator  will  show,  however,  that  this  phenomenon  does 
not  have  to  occur.  This  is  because  no  extra  information  is  saved  if  there 
is  nothing  left  to  do  on  return  from  a recursive  call.  See  [SCHEME]  and 
[Debunking] for a more thorough discussion of this.


Part One


Variable  Scoping  Disciplines 


Procedures  as  Data 

The  simple  LISP  described  in  Part  Zero  can  be  a pleasant  medium  for 
encoding  rather  complex  algorithms,  including  those  of  symbolic 
mathematics.  Often  lists  are  used  for  representing  such  structures  as  the 
set  of  coefficients  of  a polynomial  or  coordinates  of  a space  vector.  Many 
problems  require  one  to  perform  an  operation  on  each  element  of  a list  and 
produce  a new  list  of  the  results.  For  example,  it  may  be  useful  to  make  a 
list  of  the  squares  of  each  of  the  elements  in  a vector.  We  would  write 
this  as  follows: 

    (DEFINE (SQUARELIST L)
            (COND ((NULL L) '())
                  (T (CONS (SQUARE (CAR L))
                           (SQUARELIST (CDR L))))))

We  find  ourselves  writing  this  pattern  over  and  over  again: 

    (DEFINE (fLIST L)
            (COND ((NULL L) '())
                  (T (CONS (f (CAR L))
                           (fLIST (CDR L))))))

where f is a function defined on the elements of our list. It would be
nice  to  be  able  to  define  an  entity  of  the  programming  language  which  would 
capture  this  abstract  pattern.  The  "obvious"  solution  is  to  write  the 
variable  function  as  a functional  variable  which  can  be  accepted  as  an 
argument: 

    (DEFINE (MAPCAR F L)
            (COND ((NULL L) '())
                  (T (CONS (F (CAR L))
                           (MAPCAR F (CDR L))))))

(mapcar is the traditional name of this abstraction.) Using this we could
say:

    (MAPCAR SQUARE '(1 2 3))

Unfortunately,  this  will  not  work  in  our  recursion  equations  interpreter. 
Why  not? 

The  essence  of  the  problem  is  that  our  interpreter  segregates 
procedures  from  other  kinds  of  objects.  We  refer  to  F as  a procedure  but 
it was passed in as a variable. Procedures are only looked up in the
procedures symbol table, but variables are bound in env. Moreover, in the
call to mapcar, square is used as a variable, which is looked up in env, but
its definition is only available in procedures.
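
The failure can be reproduced in miniature. The following is a sketch in Python rather than in the LISP of this paper, and it models the two symbol tables as plain dictionaries. The helper names call, variable, and attempt are our own illustrative assumptions; they do not appear in the interpreter of Figures 1 and 2.

```python
# Two separate symbol tables, as in the recursion equations interpreter.
procedures = {"SQUARE": lambda x: x * x}  # consulted only for operator position
env = {"L": [1, 2, 3]}                     # consulted only for variable references


def call(operator_name, arg):
    """Apply the procedure named in operator position (procedures table only)."""
    if operator_name not in procedures:
        raise NameError(f"{operator_name} is not a defined procedure")
    return procedures[operator_name](arg)


def variable(name):
    """Evaluate a variable reference (env only)."""
    if name not in env:
        raise NameError(f"{name} is an unbound variable")
    return env[name]


def attempt(thunk):
    """Run thunk and report a lookup failure as a string instead of raising."""
    try:
        return thunk()
    except NameError as error:
        return f"error: {error}"


# (SQUARE 3) works: SQUARE appears in operator position.
direct = attempt(lambda: call("SQUARE", 3))
# In (MAPCAR SQUARE '(1 2 3)) the argument SQUARE is a variable reference.
as_argument = attempt(lambda: variable("SQUARE"))
# Inside MAPCAR, (F (CAR L)) puts the parameter F in operator position.
as_operator = attempt(lambda: call("F", 1))

print(direct)
print(as_argument)
print(as_operator)
```

Both halves of the MAPCAR call fail for the same reason: each kind of name is sought in the one table where it cannot be found.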

Let's  merge  the  two  symbol  tables...  How  could  that  hurt? 


    (DEFINE (DRIVER-LOOP-1 ENV FORM)
            (COND ((ATOM FORM)
                   (DRIVER-LOOP ENV (PRINT (EVAL FORM ENV))))
                  ((EQ (CAR FORM) 'DEFINE)
                   (DRIVER-LOOP (BIND (LIST (CAADR FORM))
                                      (LIST (LIST '&PROCEDURE (CDADR FORM) (CADDR FORM)))
                                      ENV)
                                (PRINT (CAADR FORM))))
                  (T (DRIVER-LOOP ENV (PRINT (EVAL FORM ENV))))))

For driver-loop see Figure 1.
For eval see Figure 5.
For bind see Figure 3.

Figure 4
Modified Driver Loop for Treating Procedures as Objects


We will eliminate procedures, and use env to contain both procedures
and other objects. The driver loop requires no particular changes (see
Figure 4), except for eliminating the argument '() in the calls to eval. We
will change the name procedures to env throughout as well, but of course that
isn't logically necessary, because our language is referentially
transparent. (Snicker!) {Note EVALQUOTE}

(We have introduced a funny object &PROCEDURE which we use to flag
procedural objects. In the previous interpreter it was impossible for the
user to request application of an object which was not either a primitive
operator or a procedure produced by a DEFINE form. Now that procedures
mingle freely with other data objects, it is desirable to be able to
distinguish them, e.g. for error checking in apply. We also have some
deeper motivations having to do with avoiding the confusion of a procedure
with its textual representation, but we do not want to deal with this issue
yet.)

To  fix  up  the  evaluator,  we  eliminate  all  occurrences  of  procedures. 
In  eval,  where  the  name  of  a procedure  in  a combination  is  looked  up,  we 
change  it  to  perform  the  lookup  in  env.  Finally,  there  is  a problem  in 
apply:  if  the  call  to  eval  to  evaluate  the  body  is  simply 

    (EVAL (CADDR FUN)
          (BIND (CADR FUN) ARGS '()))

then the new env given to eval does not have the procedure definitions in
it. Moreover, apply does not even have access to an environment which
contains the procedure definitions (because its parameter procedures was
deleted)! We can easily fix this. When apply is called from eval, env can
be passed along (as procedures used to be), and the call to eval from apply
can be changed to

    (EVAL (CADDR FUN)
          (BIND (CADR FUN) ARGS ENV))

In  this  way  the  environment  passed  to  eval  will  contain  the  new  variable 
bindings  added  to  the  old  environment  containing  the  procedure  definitions. 
(See  Figure  5.)  This  is  indeed  a good  characteristic:  if  the  name  of  a 
defined  procedure  is  used  as  a local  variable  (procedural  or  otherwise), 
the  new  binding  takes  precedence  locally,  temporarily  superseding  the 
global definition.

    (DEFINE (EVAL EXP ENV)
            (COND ((ATOM EXP)
                   (COND ((NUMBERP EXP) EXP)
                         (T (VALUE EXP ENV))))
                  ((EQ (CAR EXP) 'QUOTE)
                   (CADR EXP))
                  ((EQ (CAR EXP) 'COND)
                   (EVCOND (CDR EXP) ENV))
                  (T (APPLY (VALUE (CAR EXP) ENV)
                            (EVLIS (CDR EXP) ENV)
                            ENV))))

    (DEFINE (APPLY FUN ARGS ENV)
            (COND ((PRIMOP FUN) (PRIMOP-APPLY FUN ARGS))
                  ((EQ (CAR FUN) '&PROCEDURE)
                   (EVAL (CADDR FUN)
                         (BIND (CADR FUN) ARGS ENV)))
                  (T (ERROR))))

    (DEFINE (EVCOND CLAUSES ENV)
            (COND ((NULL CLAUSES) (ERROR))
                  ((EVAL (CAAR CLAUSES) ENV)
                   (EVAL (CADAR CLAUSES) ENV))
                  (T (EVCOND (CDR CLAUSES) ENV))))

    (DEFINE (EVLIS ARGLIST ENV)
            (COND ((NULL ARGLIST) '())
                  (T (CONS (EVAL (CAR ARGLIST) ENV)
                           (EVLIS (CDR ARGLIST) ENV)))))

For value and bind see Figure 3.

Figure 5
Evaluator for Treating Procedures as Objects


Another  good  thing  about  this  version  of  the  interpreter  is  that 
the  gross  non-modularity  of  the  scattered  occurrences  of  procedures  has 
disappeared.  The  problem  has  not  been  solved,  of  course,  but  we  certainly 
feel  relieved  that  the  particular  manifestation  has  been  removed! 

By the way, we also eliminated the explicit tests for t and nil in
eval, assuming that we can simply put their initial values in the initial
environment provided by driver.

An  interesting  property  of  this  interpreter  is  that  free  variables 
now  have  been  given  a meaning,  though  we  originally  did  not  intend  this. 
Indeed,  in  the  original  recursion  equations  interpreter,  there  were  free 
variables  in  a sense:  all  procedural  variables  were  free  (but  they  could 
be used only in operator position in a combination). In our new
interpreter, thanks to the merging of the procedural and variable
environments, we may have not only bound procedure names, but also free
variable names, for after all the two kinds of names are now one.

This interpreter differs in only small details from the one in LISP
1.5 [LISP 1.5M]. Both have dynamically scoped free variables (we will
elaborate  on  this  point  later).  We  might  note  that  the  reference  to  value 
in  eval  when  computing  the  first  argument  for  apply  can  be  replaced  by  a 
reference  to  eval;  this  does  the  same  thing  if  a variable  appears  in  the 
operator  position  of  a combination,  and  allows  the  additional  general 
ability  to  use  any  expression  to  compute  the  procedure.  This  difference  in 
fact  appears  in  the  LISP  1.5  interpreter.  There  are  other  slight 
differences,  such  as  the  representation  of  primitive  operators  and  the 
handling  of  procedures  which  are  not  primitive  or  user-defined.  Aside  from 
these,  the  greatest  difference  between  our  interpreter  and  LISP  1.5's  is 
the use of lambda notation. This we will meet in the next section.


We now have the ability to define and use the mapcar procedure.
After some more experience in programming, however, we find that, having
abstracted the common pattern from our loops, the remaining part (the
functional argument) tends to be different for each invocation of mapcar.
Unfortunately,  our  language  for  all  practical  purposes  requires  that  we  use 
a name  to  refer  to  the  functional  arguments,  because  the  only  way  we  have 
to  denote  new  procedures  is  to  define  names  for  them.  We  soon  tire  of 
thinking  up  new  unique  names  for  trivial  procedures: 

    (DEFINE (FOOBAR-43 X) (* (+ X 4) 3))

    ... (MAPCAR FOOBAR-43 L) ...

We  run  the  risk  of  name  conflicts;  also,  it  would  be  nice  to  be  able  to 
write  the  procedure  definition  at  the  single  point  of  use. 

More  abstractly,  given  that  procedures  have  become  referenceable 
objects  in  the  language,  it  would  be  nice  to  have  a notation  for  them  as 
objects,  or  rather  a way  to  write  an  S-expression  in  code  that  would 
evaluate to a procedure. LISP [LISP 1M] adapted such a notation from the
λ-calculus of Alonzo Church [Church]:

    (LAMBDA <variables> <body>)

Comparing  this  with  the  define  notation,  we  see  that  it  has  the  same  parts: 
a keyword  so  that  it  can  be  recognized;  a list  of  parameters;  and  a body. 
The  only  difference  is  the  omission  of  an  irrelevant  name.  It  is  just  the 
right  thing. 

Given  this,  we  can  simply  write 

    (MAPCAR (LAMBDA (X) (* X X)) L)

rather  than  having  to  define  square  as  a separate  procedure.  An  additional 
benefit  is  that  this  notation  makes  it  very  easy  for  a compiler  to  examine 
this  code  and  produce  an  efficient  iterative  implementation,  because  all 
the relevant code is present locally (assuming the compiler knows about
MAPCAR).
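
For comparison, here is the same idea sketched in Python, whose lambda keyword plays the role of LAMBDA. The recursive mapcar below is our own transliteration of the MAPCAR definition given earlier, and the helper name add_then_triple is a hypothetical stand-in for a throwaway procedure name.

```python
def mapcar(f, l):
    """Apply f to each element of l and collect the results in a new list."""
    if not l:
        return []
    return [f(l[0])] + mapcar(f, l[1:])


# A trivial procedure that must be given a unique name before it can be used.
def add_then_triple(x):
    return (x + 4) * 3


named = mapcar(add_then_triple, [1, 2, 3])

# The same procedure written at its single point of use, with no name at all.
anonymous = mapcar(lambda x: (x + 4) * 3, [1, 2, 3])

# Squaring no longer needs a separately defined SQUARE procedure.
squares = mapcar(lambda x: x * x, [1, 2, 3])

print(named, anonymous, squares)
```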

Installing this notation requires only a two-line change in eval
(see Figure 6).

    (DEFINE (EVAL EXP ENV)
            (COND ((ATOM EXP)
                   (COND ((NUMBERP EXP) EXP)
                         (T (VALUE EXP ENV))))
                  ((EQ (CAR EXP) 'QUOTE)
                   (CADR EXP))
                  ((EQ (CAR EXP) 'COND)
                   (EVCOND (CDR EXP) ENV))
                  ((EQ (CAR EXP) 'LAMBDA)
                   (CONS '&PROCEDURE (CDR EXP)))
                  (T (APPLY (EVAL (CAR EXP) ENV)
                            (EVLIS (CDR EXP) ENV)
                            ENV))))

For value see Figure 3.
For apply, evcond, and evlis see Figure 5.

Figure 6
Evaluator for LAMBDA-notation (Dynamically Scoped)


(The reader might have noticed that all eval does for a LAMBDA-expression
is replace the word LAMBDA with the word &PROCEDURE, and that we
could avoid that work by uniformly using LAMBDA instead of &PROCEDURE as the
flag for a procedural object. Given then that eval on a LAMBDA-expression
is an identity operation, we can eliminate the handling of lambda in eval
merely by requiring the user to write '(LAMBDA ...) instead of (LAMBDA ...).
Although the implementors of most LISPs have in fact done just this ever
since LISP 1, it is a very bad idea. eval is supposed to process
expressions and produce their values, and the fact that it might be
implemented as an identity operation is no business of the user. The
confusion between a procedural object and an expression having that object
as its value will lead to serious trouble. (Imagine confusing 15 with
(+ 7 8), and trying to take the car of the former instead of the latter, or
trying to add 3 to the latter instead of the former!) The quoted
LAMBDA-expression engineering trick discourages the implementation of a
referentially transparent LISP. In Part Four we will see the extreme
difficulties for a LISP compiler (or other program-understander) caused by
the blatant destruction of referential transparency. {Note QUOTE Shafts
the Compiler})

The  ability  to  use  free  variables  and  local  procedures  gives  us 
additional  freedom  to  express  interesting  procedures.  For  example,  we  can 
define  a procedure  scale  which  multiplies  a vector  of  arbitrary  length  by  a 
scalar.  If  the  vector  is  represented  as  a list  of  components,  then  we  can 
use mapcar and a local procedure with a free variable:

    (DEFINE (SCALE S V)
            (MAPCAR (LAMBDA (X) (* X S))
                    V))

Everything would be just peachy keen, except for one small glitch.
Suppose that the programmer who wrote scale for some reason chose the name L
rather than S to represent the scalar:

    (DEFINE (SCALE L V)
            (MAPCAR (LAMBDA (X) (* X L))
                    V))

Although the version with S works, the version with L does not work. This
happens because mapcar also uses the name L for one of its arguments (that
is, a "local" variable). The reference to L in the LAMBDA-expression in
scale refers to the L bound in mapcar and not to the one bound by scale. In
general,  free  variable  references  in  one  procedure  refer  to  the  bindings  of 
variables  in  other  procedures  higher  up  in  the  chain  of  calls.  This 
discipline  is  called  dynamic  scoping  of  variables,  because  the  connection 
between  binding  and  reference  is  established  dynamically,  changing  as 
different  procedures  are  executed. 
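
The capture can be traced by hand with a small simulation. The sketch below is Python, not the evaluator of Figure 6: procedure bodies are ordinary Python functions that receive the environment explicitly, and the environment is assumed to be a simple list of (name, value) pairs with the newest bindings first. The names apply_dynamic and make_scale are ours.

```python
def bind(names, values, env):
    """Return a new environment with names bound to values in front of env."""
    return list(zip(names, values)) + env


def value(name, env):
    """Return the most recent binding of name."""
    for bound_name, bound_value in env:
        if bound_name == name:
            return bound_value
    raise NameError(name)


def apply_dynamic(proc, args, env):
    """Dynamic scoping: bind the arguments onto the caller's environment."""
    params, body = proc
    return body(bind(params, args, env))


def mapcar_body(env):
    f, l = value("F", env), value("L", env)
    if not l:
        return []
    head = apply_dynamic(f, [l[0]], env)
    return [head] + apply_dynamic(MAPCAR, [f, l[1:]], env)


MAPCAR = (["F", "L"], mapcar_body)  # MAPCAR's parameters are F and L


def make_scale(scalar):
    """Build SCALE with its first parameter named by the string `scalar`."""
    multiplier = (["X"], lambda env: value("X", env) * value(scalar, env))
    body = lambda env: apply_dynamic(MAPCAR, [multiplier, value("V", env)], env)
    return ([scalar, "V"], body)


good = apply_dynamic(make_scale("S"), [2, [1, 2, 3]], [])
bad = apply_dynamic(make_scale("L"), [2, [1, 2, 3]], [])
print(good)  # the scalar S is found below MAPCAR's bindings
print(bad)   # the free L now finds MAPCAR's list instead of the scalar
```

In the second call the multiplier sees the list that mapcar bound to L, so it never multiplies by the scalar 2 at all; the result depends on nothing but the spelling of a parameter name.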

That  the  behavior  of  the  scale  program  depends  on  the  choice  of 
names  for  its  local  variables  is  a violation  of  referential  transparency. 
The  modularity  of  the  mapcar  abstraction  has  been  destroyed,  because  no  one 
can  use  that  abstraction  without  understanding  the  details  of  its 
implementation. This is the famous "FUNARG problem" [Moses] [LISP
History].

If  we  are  to  avoid  such  conflicts  between  different  uses  of  the 
same  name,  we  must  arrange  our  language  so  that  the  choice  of  names  locally 
cannot  have  global  repercussions.  More  specifically,  we  must  have  the 
ability  to  bind  a variable  in  such  a way  that  it  will  have  a truly  local 
meaning (though in general we might not want all variables to be strictly
local; we will consider later the possibility of having several types of
variables).


Lexical Scoping

We  now  construct  an  interpreter  in  which  all  variables  have 
strictly  local  usage.  This  discipline  is  called  lexical  scoping  of 
variables,  and  has  been  used  in  many  programming  languages,  including  Algol 
60 [Naur]. The term "lexical" refers to the fact that all references to a
local  variable  binding  are  textually  apparent  in  the  program.  The  term 
static  binding  is  also  used,  indicating  that  the  connection  between  binding 
and  reference  is  unchanging  at  run  time. 

The difficulty in scale is that the body of the LAMBDA-expression
(* X L) is evaluated using the env which was available to eval (and so passed
to apply) when it was working on the body of mapcar. But we want the (* X L)
to be evaluated using the env which was available when the body of scale was
being evaluated. Somehow we must arrange for this environment to be
available for evaluating (* X L).

The correct environment was available at the time the LAMBDA-expression
was evaluated to produce a &PROCEDURE-object. Why not just tack
the environment at that point onto the end of the &PROCEDURE-object so that
it can be used when the procedure is applied?

This  is  in  fact  the  right  thing  to  do.  The  object  we  want  to  give 
to  mapcar  must  be  not  just  the  text  describing  the  computation  to  be 
performed,  but  also  the  meanings  of  the  free  variables  referenced  in  that 
text.  Only  the  combination  of  the  two  can  correctly  specify  the 
computation  which  reflects  the  complete  meaning  of  the  abstract  function  to 
be  mapped.  This  is  the  first  place  where  we  find  it  crucial  to  distinguish 
the three ideas: (1) The program: the text describing a procedure, e.g.
in the form of an S-expression; (2) The procedure which is executed by the
computer;  and  (3)  The  mathematical  function  or  other  conceptual  operation 
computed  by  the  execution  of  the  procedure. 

To install lexical scoping in our interpreter, we must change the
treatment of LAMBDA-expressions in eval to make the current environment env
part of the &PROCEDURE-object. We say that the procedure is closed in the
current environment, and the &PROCEDURE-object is therefore called a closure
of the procedure, or a closed procedure. We must also change apply to bind
the new variable-value associations onto the environment in the
&PROCEDURE-object, rather than onto that passed by eval. When we have done this, we
see that in fact the environment passed by eval is not used, so we can
eliminate the parameter env from the definition of apply, and change the
invocation of apply that occurs in eval. Thus, while the handling of
LAMBDA-expressions has become more complicated, the handling of env has been
correspondingly simplified. (See Figure 7.)

Had  we  previously  adopted  the  trick  described  in  the  preceding 
section,  wherein  the  user  was  required  to  write  '(lambda  ...)  rather  than 
(lambda ...), it would have been more difficult to adjust the interpreter to
accommodate lexical scoping; it would have involved a large change rather
than a small tweak. (The change from dynamic scoping to lexical scoping
does involve a gross change of programming style, and this is undoubtedly
why, once dynamic scoping had historically become the standard discipline,
the quotation problem was never cleared up. We will see later that dynamic
scoping is a valuable technique for producing modularity, but we see no
virtue at all in the confusion produced by quoted LAMBDA-expressions. While
quoted LAMBDA-expressions do produce dynamic scoping, the support of dynamic
scoping does not depend on the quotation of LAMBDA-expressions.)

While lexical scoping solves our problems of referential
transparency, we will see later that we must in turn pay a large price for
it, but it is not a price of run-time efficiency (contrary to popular
belief)!


    (DEFINE (EVAL EXP ENV)
            (COND ((ATOM EXP)
                   (COND ((NUMBERP EXP) EXP)
                         (T (VALUE EXP ENV))))
                  ((EQ (CAR EXP) 'QUOTE)
                   (CADR EXP))
                  ((EQ (CAR EXP) 'LAMBDA)
                   (LIST '&PROCEDURE (CADR EXP) (CADDR EXP) ENV))
                  ((EQ (CAR EXP) 'COND)
                   (EVCOND (CDR EXP) ENV))
                  (T (APPLY (EVAL (CAR EXP) ENV)
                            (EVLIS (CDR EXP) ENV)))))

    (DEFINE (APPLY FUN ARGS)
            (COND ((PRIMOP FUN) (PRIMOP-APPLY FUN ARGS))
                  ((EQ (CAR FUN) '&PROCEDURE)
                   (EVAL (CADDR FUN)
                         (BIND (CADR FUN) ARGS (CADDDR FUN))))
                  (T (ERROR))))

For value and bind see Figure 3.
For evcond and evlis see Figure 5.

Figure 7
Evaluator for Lexically Scoped LAMBDA-notation
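
A rough Python transliteration of a lexically scoped evaluator may help in experimenting with it. This is a sketch under several assumptions of ours, not the code of Figure 7: S-expressions are nested Python lists, symbols are strings, the environment is a list of (name, value) pairs, and primitive operators are Python callables. The helper define cheats by appending to the top-level list after the closure has been made, which sidesteps the question of how top-level definitions should be installed rather than answering it.

```python
import operator


def bind(names, values, env):
    """Return a new environment with names bound to values in front of env."""
    return list(zip(names, values)) + env


def value(name, env):
    for bound_name, bound_value in env:
        if bound_name == name:
            return bound_value
    raise NameError(name)


def evaluate(exp, env):
    if isinstance(exp, (int, float)):
        return exp
    if isinstance(exp, str):
        return value(exp, env)
    if exp[0] == "QUOTE":
        return exp[1]
    if exp[0] == "LAMBDA":
        # Close the procedure in the environment current at this point.
        return ("&PROCEDURE", exp[1], exp[2], env)
    if exp[0] == "COND":
        return evcond(exp[1:], env)
    return apply_procedure(evaluate(exp[0], env),
                           [evaluate(arg, env) for arg in exp[1:]])


def evcond(clauses, env):
    for test, consequent in clauses:
        if evaluate(test, env):
            return evaluate(consequent, env)
    raise ValueError("no COND clause was true")


def apply_procedure(fun, args):
    if callable(fun):  # primitive operator
        return fun(*args)
    if isinstance(fun, tuple) and fun[0] == "&PROCEDURE":
        _, params, body, closure_env = fun
        # Bind onto the environment stored in the closure, not the caller's.
        return evaluate(body, bind(params, args, closure_env))
    raise TypeError("not a procedure")


top = [("T", True), ("*", operator.mul),
       ("CAR", lambda l: l[0]), ("CDR", lambda l: l[1:]),
       ("CONS", lambda a, d: [a] + d), ("NULL", lambda l: l == [])]


def define(name, lambda_exp):
    """Assumption: mutate the shared top-level list so closures can see it."""
    top.append((name, evaluate(lambda_exp, top)))


define("MAPCAR", ["LAMBDA", ["F", "L"],
                  ["COND", [["NULL", "L"], ["QUOTE", []]],
                           ["T", ["CONS", ["F", ["CAR", "L"]],
                                          ["MAPCAR", "F", ["CDR", "L"]]]]]])
define("SCALE", ["LAMBDA", ["L", "V"],
                 ["MAPCAR", ["LAMBDA", ["X"], ["*", "X", "L"]], "V"]])

scaled = evaluate(["SCALE", 2, ["QUOTE", [1, 2, 3]]], top)
print(scaled)
```

Here SCALE uses the troublesome parameter name L, yet the inner LAMBDA still multiplies by the scalar, because its closure carries the environment in which SCALE's L was bound.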


Let's  see  what  we  have  bought.  One  thing  we  can  do  is  generalize 
mapcar.  After  yet  more  programming  experience  we  find  that  we  write  many 
MAPCAR-like  procedures.  For  example,  we  might  need  a kind  of  mapcar  where 
the  function  f always  returns  a list,  and  we  want  to  produce  not  a list  of 
the  lists,  but  the  concatenation  of  the  lists.  We  might  also  want  to  take 
the  sum  or  the  product  of  all  the  numbers  in  a list,  or  the  sum  of  the  cars 
of  all  elements  in  a list.  The  general  pattern  is  that  we  look  at  each 
element  of  a list,  do  something  to  it,  and  then  somehow  combine  the  results 
of all these elementwise operations. Another application might be to check
for duplicates in a list; for each element we want to see whether another
copy follows it in the list. We further generalize the pattern to look at
successive trailing segments of the list; we can always take the car to
get a single element.

We  could  simply  add  more  procedural  parameters  to  mapcar: 

    (DEFINE (MAP F OP ID L)
            (COND ((NULL L) ID)
                  (T (OP (F L)
                         (MAP F OP ID (CDR L))))))

Using this, we can make a copy of the list L:

    (MAP CAR CONS '() L)

We can simulate (MAPCAR F L):

    (MAP (LAMBDA (X) (F (CAR X))) CONS '() L)

Indeed, we can write:

    (DEFINE (MAPCAR F L)
            (MAP (LAMBDA (X) (F (CAR X))) CONS '() L))

We can sum the elements of L:

    (MAP CAR + 0 L)

We can take the product of the elements of L:

    (MAP CAR * 1 L)

We can count the pairs of duplicate elements of L:

    (MAP (LAMBDA (X) X)
         (LAMBDA (Y N) (COND ((MEMBER (CAR Y) (CDR Y))
                              (+ N 1))
                             (T N)))
         0
         L)
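
The uses of MAP listed above can be checked with a direct Python rendering. This is a sketch: the name map_tails (chosen to avoid Python's built-in map), the helpers car and cons, and the sample list are our own assumptions.

```python
def map_tails(f, op, identity, l):
    """(MAP F OP ID L): apply f to successive tails of l and combine with op."""
    if not l:
        return identity
    return op(f(l), map_tails(f, op, identity, l[1:]))


def car(l):
    return l[0]


def cons(a, d):
    return [a] + d


items = [3, 1, 3, 2, 1]

copy = map_tails(car, cons, [], items)
total = map_tails(car, lambda a, b: a + b, 0, items)
product = map_tails(car, lambda a, b: a * b, 1, items)
# (MAPCAR F L) simulated with F = doubling.
doubled = map_tails(lambda tail: 2 * car(tail), cons, [], items)
# Count the elements that have another copy later in the list.
duplicates = map_tails(lambda tail: tail,
                       lambda tail, n: n + 1 if tail[0] in tail[1:] else n,
                       0, items)

print(copy, total, product, doubled, duplicates)
```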

If  we  have  occasion  to  take  the  sum  over  lots  of  lists  in  different 
places,  we  might  want  to  package  the  operation  "sum  over  list"  — we  get 
awfully tired of writing "CAR + 0". We can write:

    (DEFINE (MAPGEN F OP ID)
            (LAMBDA (L) (MAP F OP ID L)))

The result of (MAPGEN CAR + 0) we might call sum; it is a procedure of one
argument which will sum the elements of a list. The reason we wrote a
procedure to construct sum, rather than just writing:

    (DEFINE (SUM L)
            (MAP CAR + 0 L))

is that mapgen serves as a generalized constructor of such procedures, thus
capturing an interesting abstraction; we might call the result of (MAPGEN
CAR * 1), for example, product, and so on.
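
In Python the same constructor is a function that returns a function. The sketch below repeats a map_tails helper so that it stands alone; that helper and the names sum_of and product_of are illustrative assumptions of ours.

```python
from operator import add, mul


def map_tails(f, op, identity, l):
    """(MAP F OP ID L): apply f to successive tails of l and combine with op."""
    if not l:
        return identity
    return op(f(l), map_tails(f, op, identity, l[1:]))


def car(l):
    return l[0]


def mapgen(f, op, identity):
    """Return a one-argument procedure with f, op, and identity fixed."""
    return lambda l: map_tails(f, op, identity, l)


sum_of = mapgen(car, add, 0)       # plays the role of (MAPGEN CAR + 0)
product_of = mapgen(car, mul, 1)   # plays the role of (MAPGEN CAR * 1)

print(sum_of([1, 2, 3, 4]), product_of([1, 2, 3, 4]))
```

Each call to mapgen yields a fresh procedure that remembers its own f, op, and identity, which is exactly what requires closures.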

What  is  interesting  about  this  is  that  we  can  write  procedures 
which  construct  other  procedures.  This  is  not  to  be  confused  with  the 
ability to construct S-expression representations of procedures; that
ability  is  shared  by  all  of  the  interpreters  we  have  examined.  The  ability 
to  construct  procedures  was  not  available  in  the  dynamically  scoped 
interpreter.  In  solving  the  violation  of  referential  transparency  we  seem 
to  have  stumbled  across  a source  of  additional  abstractive  power.  While 
the  map  example  may  seem  strained,  this  example  is  quite  natural:  given  a 
numerical  function,  to  produce  a new  function  which  numerically 
approximates  the  derivative  of  the  first. 

    (DEFINE (DERIVATIVE F ΔX)
            (LAMBDA (X)
                    (/ (- (F (+ X ΔX))
                          (F X))
                       ΔX)))

Notice  that  this  is  not  a symbolic  process  dealing  with  the  representation 
of  f.  The  derivative  procedure  knows  nothing  about  the  internal  structure  of 
f.  All  it  does  is  construct  a new  procedure  which  uses  f only  by  invoking 
it. The program derivative captures (in approximation) the abstraction of
"derivative" as a mapping from the space of numerical (and reasonably
well-behaved!) functions to itself.
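
A Python rendering makes it easy to try this on a concrete function. The step size and the cubic test function below are arbitrary choices of ours, not values from the text.

```python
def derivative(f, dx):
    """Return a new function approximating the derivative of f with step dx."""
    return lambda x: (f(x + dx) - f(x)) / dx


def cube(x):
    return x ** 3


# derivative never inspects cube; it only invokes it.
dcube = derivative(cube, 1e-6)
approx = dcube(2.0)

print(approx)  # close to the exact derivative 3 * 2**2 = 12
```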

The  ability  to  define  procedures  which  construct  other  procedures 
is powerful. We can use it to construct procedures which behave like data
objects. For example, since the only constraints which cons must (so far)
obey are the algebraic identities:

    (CAR (CONS α β)) = α    and    (CDR (CONS α β)) = β

the value of (CONS α β) can be thought of as a procedure which produces α or
β on demand (cf. [Hewitt and Smith] [Fischer]). We can write this as
follows:

    (DEFINE (CONS A D)
            (LAMBDA (M)
                    (COND ((= M 0) A)
                          ((= M 1) D))))

    (DEFINE (CAR X) (X 0))

    (DEFINE (CDR X) (X 1))

Here we have envisioned the value of (CONS α β) as a vector of two elements,
with zero-origin indexing. However, this definition of cons makes use of
the primitive operator =. We can define the "primitive operators" cons,
car, and cdr without using another primitive operator at all! Following
[Church], we write:

    (DEFINE (CONS A D)
            (LAMBDA (M) (M A D)))

    (DEFINE (CAR X)
            (X (LAMBDA (A D) A)))

    (DEFINE (CDR X)
            (X (LAMBDA (A D) D)))


Rather than using 0 and 1 (i.e. data objects) as selectors, we instead use
(LAMBDA (A D) A) and (LAMBDA (A D) D) (i.e. procedures).
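
Both encodings carry over to Python closures almost word for word. In the sketch below the suffix _indexed on the first set of names is ours, added only so that the two versions can coexist; the IndexError for a selector other than 0 or 1 is also our addition, since the COND above leaves that case unspecified.

```python
# Version 1: a pair is a procedure indexed by 0 or 1.
def cons_indexed(a, d):
    def pair(m):
        if m == 0:
            return a
        if m == 1:
            return d
        raise IndexError(m)  # our addition; not specified in the original
    return pair


def car_indexed(x):
    return x(0)


def cdr_indexed(x):
    return x(1)


# Version 2: a pair is a procedure that hands both parts to a selector.
def cons(a, d):
    return lambda m: m(a, d)


def car(x):
    return x(lambda a, d: a)


def cdr(x):
    return x(lambda a, d: d)


p = cons_indexed("alpha", "beta")
q = cons("alpha", cons("beta", None))

print(car_indexed(p), cdr_indexed(p), car(q), car(cdr(q)))
```

Neither version stores its parts in any data structure; the two values live only in the environment captured by each closure.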

We  can  think  of  the  LAMBDA-expression  which  appears  as  the  body  of 
the  definition  of  derivative  or  of  cons  as  a prototype  for  new  procedures. 
When  derivative  or  cons  is  called,  this  prototype  is  instantiated  as  a 
closure,  with  certain  variables  free  to  the  prototype  bound  to  the 
arguments  given  to  the  constructor. 

At  this  point  it  looks  like  we  have  solved  all  our  problems.  We 
started with a referentially transparent but expressively weak language.
We  augmented  it  with  procedural  objects  and  a notation  for  them  in  order  to 
capture  certain  notions  of  abstraction  and  modularity.  In  doing  this  we 
lost  the  referential  transparency.  We  have  now  regained  it,  and  in  the 
process  uncovered  even  more  powerful  abstraction  capabilities. 


Top Levels versus Referential Transparency


"The  Three  Laws  of  Thermodynamics: 

1. You can't win.

2. You can't break even.

3. You can't get out of the game."
— Unknown 


There is no free lunch. We have ignored a necessary change to the
top level driver loop. We have changed the format of &PROCEDURE-objects.
driver-loop-1 constructs &PROCEDURE-objects; it must be rewritten to
accommodate the change. We must include an environment in each such
object. The obvious fix is shown in Figure 8.

    (DEFINE (DRIVER-LOOP-1 ENV FORM)
            (COND ((ATOM FORM)
                   (DRIVER-LOOP ENV (PRINT (EVAL FORM ENV))))
                  ((EQ (CAR FORM) 'DEFINE)
                   (DRIVER-LOOP (BIND (LIST (CAADR FORM))
                                      (LIST (LIST '&PROCEDURE
                                                  (CDADR FORM)
                                                  (CADDR FORM)
                                                  ENV))
                                      ENV)
                                (PRINT (CAADR FORM))))
                  (T (DRIVER-LOOP ENV (PRINT (EVAL FORM ENV))))))

For driver-loop see Figure 1.
For bind see Figure 3.
For eval see Figure 7.

Figure 8
Modified Driver Loop for Lexically Scoped LAMBDA-notation


It  doesn't  work.  This  patch  does  put  the  finishing  touch  on  the 
preservation  of  referential  transparency.  It  does  it  so  well,  that  each 
new  definition  can  only  refer  to  previously  defined  names!  We  have  lost 
the  ability  to  make  forward  references.  We  can't  redefine  a procedure 
which  had  a bug  in  it  and  expect  old  references  to  use  the  new  definition. 
In fact, we cannot use define to make a recursive procedure. {Note Y-operator}
The &PROCEDURE-object for each defined procedure contains an
environment having only the previously defined procedures.
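
The loss of recursion is easy to reproduce in a simulation. The Python sketch below is not the driver loop of Figure 8; it assumes an environment that is a list of (name, value) pairs and represents procedure bodies as Python functions that take the environment explicitly. The procedure COUNT and the helper names are hypothetical.

```python
def bind(names, values, env):
    """Return a new environment with names bound to values in front of env."""
    return list(zip(names, values)) + env


def value(name, env):
    for bound_name, bound_value in env:
        if bound_name == name:
            return bound_value
    raise NameError(name)


def define(name, params, body, env):
    """Close the new procedure in env as it stands BEFORE name is added."""
    closure = (params, body, env)
    return bind([name], [closure], env)


def apply_closure(proc, args):
    params, body, closure_env = proc
    return body(bind(params, args, closure_env))


def count_body(env):
    """COUNT counts down from N to 0 by calling itself."""
    n = value("N", env)
    if n == 0:
        return 0
    return 1 + apply_closure(value("COUNT", env), [n - 1])  # recursive reference


top = define("COUNT", ["N"], count_body, [])
count = value("COUNT", top)

base_case = apply_closure(count, [0])  # never reaches the recursive reference
try:
    apply_closure(count, [3])
    recursive_case = "ok"
except NameError as error:
    recursive_case = f"unbound: {error}"

print(base_case, recursive_case)
```

COUNT is bound in the top-level environment, but not in the environment stored inside its own closure, so the recursive reference finds nothing.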

We  are  finally  confronted  with  the  fact  that  we  have  been  seeking 
the  impossible.  We  have  tried  to  attain  complete  referential  transparency 
(in  the  expectation  that  modularity  would  be  enhanced),  while  trying  also 
to  retain  the  notion  of  an  incremental,  interactive  top-level  loop  for 
reading definitions. But the very existence of such a top level inherently
constitutes  a violation  of  referential  transparency.  A piece  of  code  can 
be  read  in  which  refers  to  an  as  yet  undefined  identifier  (the  name  of  a 
procedure,  for  example),  and  then  later  a definition  for  that  identifier 
read  in  (thereby  altering  the  meaning  of  the  reference). 

If  we  stubbornly  insist  on  maintaining  absolute  referential 
transparency  in  our  language,  we  are  forced  to  eliminate  the  incremental 
top level loop. A program must be constructed monolithically. We must
read  in  all  our  procedure  definitions  at  once,  close  them  all  together,  and 
then  take  one  or  more  shots  at  running  them.  (This  is  the  way  many  Algol 
implementations  work;  development  of  large  systems  can  be  very  difficult 
if  parts  cannot  be  separately  constructed  and  compiled.)  We  are  forced  to 
give  up  interactive  debugging,  because  we  cannot  redefine  erroneous 
procedures  easily.  We  are  forced  to  give  up  incremental  compilation  of 
separate modules.

We have thrown the baby out with the bath water. The very purpose
of referential transparency is to permit programs to be divided into parts
so that each part can be separately specified without a description of its
implementation. The desirable result is that pieces can be separately
written and debugged. {Note Debugging}

On  the  other  hand,  if  we  give  up  absolute  referential  transparency, 
we  can  fix  the  top  level  loop.  The  basic  problem  is  that  we  really  want 
procedures  defined  at  top  level  to  be  able  to  refer  to  procedures  defined 
later. The problem with pure lexical scoping is that the &PROCEDURE-objects
are  created  too  early,  when  the  desired  environment  is  not  yet  available. 
We  must  arrange  for  them  to  be  constructed  at  a later  time.  We  could 
simply  use  the  environment  in  use  by  the  caller  at  the  time  of  invocation 
(reverting  to  dynamic  scoping).  But  dynamic  scoping  would  lose  a great 
deal  of  referential  transparency  and  abstractive  power.  Procedures  must 
not  be  allowed  to  refer  to  variables  internal  to  other  procedures,  but  only 
to top-level variables existing at the time they are called. Therefore
only the future top-level environment is to be included in the
&PROCEDURE-object when it is eventually constructed. In this way free variable
references  will  be  dynamic  only  with  respect  to  the  top-level  environment. 

Considering  our  dynamically-scoped  interpreter  above  (see  Figure 
5),  we  would  be  led  to  modify  apply  again,  to  combine  the  best  properties 
of  the  dynamically  and  lexically  scoped  interpreters.  Indeed,  the  two 
kinds  of  function  can  easily  coexist.  We  borrow  the  code  involving  the 
passing  of  procedures  (including  the  driver-loop,  modified  to  initialize  env  to 
procedures)  from  the  recursion-equations  interpreter  (Figures  1 and  2),  the 
code  for  using  this  top-level  environment  from  the  dynamically-scoped 
interpreter (Figure 5), and the code for constructing &PROCEDURE-objects for
LAMBDA-expressions from the lexically-scoped interpreter (Figure 7). The
result  appears  in  Figure  9. 
(DEFINE (EVAL EXP ENV PROCEDURES)
  (COND ((ATOM EXP)
         (COND ((NUMBERP EXP) EXP)
               (T (VALUE EXP ENV))))
        ((EQ (CAR EXP) 'QUOTE)
         (CADR EXP))
        ((EQ (CAR EXP) 'LAMBDA)
         (LIST '&PROCEDURE (CADR EXP) (CADDR EXP) ENV))
        ((EQ (CAR EXP) 'COND)
         (EVCOND (CDR EXP) ENV PROCEDURES))
        (T (APPLY (EVAL (CAR EXP) ENV PROCEDURES)
                  (EVLIS (CDR EXP) ENV PROCEDURES)
                  PROCEDURES))))


(DEFINE (APPLY FUN ARGS PROCEDURES)
  (COND ((PRIMOP FUN) (PRIMOP-APPLY FUN ARGS))
        ((EQ (CAR FUN) '&PROCEDURE)
         (EVAL (CADDR FUN)
               (BIND (CADR FUN) ARGS (CADDDR FUN))
               PROCEDURES))
        (T (EVAL (CADR FUN)
                 (BIND (CAR FUN) ARGS PROCEDURES)
                 PROCEDURES))))


(DEFINE (DRIVER-LOOP-1 PROCEDURES FORM)
  (COND ((ATOM FORM)
         (DRIVER-LOOP PROCEDURES
                      (PRINT (EVAL FORM PROCEDURES PROCEDURES))))
        ((EQ (CAR FORM) 'DEFINE)
         (DRIVER-LOOP (BIND (LIST (CAADR FORM))
                            (LIST (LIST (CDADR FORM) (CADDR FORM)))
                            PROCEDURES)
                      (PRINT (CAADR FORM))))
        (T (DRIVER-LOOP PROCEDURES
                        (PRINT (EVAL FORM PROCEDURES PROCEDURES))))))

For  driver-loop  see  Figure  1. 

For  value  and  bind  see  Figure  3. 

For evcond and evlis see Figure 2.

Figure  9 

An  Evaluator  for  Local  Lexical  Scoping 
and  Dynamic  Top-Level  References 


Ugh bletch, procedures is back! Also, there are two kinds of user-
defined procedural objects floating around. There happens to be another
way to fix the top level, which yields additional flavor. We note that
during any one processing cycle of eval/apply, procedures remains constant.
We can thus choose to associate the top level environment with a top-level
procedure at a time earlier than invocation time in apply. We also note
that lookup1 will have its hands on the top-level environment anyway just
before it locates the definition of a top-level procedure. Exploiting this
idea yields an alternate solution. {Note labels}

In the new driver loop (see Figure 10) we no longer use bind to
augment  the  top-level  environment  whenever  a new  definition  is  made.  We 
instead  have  all  of  the  top-level  definitions  in  one  frame  of  the 
environment.  When  a new  definition  is  to  be  made  we  extract  the  list  of 
names  and  the  list  of  values  for  the  old  definitions  from  the  old 
environment  and  make  a new  top-level  environment  with  the  lists  of  names 
and  values  separately  augmented. 

Instead of creating &PROCEDURE-objects, this driver loop creates
&LABELED-objects, which have the same format except that they contain no
environment. A &LABELED-object is purely internal and can never be seen by
a user program. When lookup1 encounters such an object as the value of a
variable, it immediately creates the corresponding &PROCEDURE-object, using
the environment at hand, which turns out to be the top-level environment.
(DEFINE (DRIVER-LOOP-1 ENV FORM)
  (COND ((ATOM FORM)
         (DRIVER-LOOP ENV (PRINT (EVAL FORM ENV))))
        ((EQ (CAR FORM) 'DEFINE)
         (DRIVER-LOOP (LIST (CONS (CONS (CAADR FORM) (CAAR ENV))
                                  (CONS (LIST '&LABELED
                                              (CDADR FORM)
                                              (CADDR FORM))
                                        (CDAR ENV))))
                      (PRINT (CAADR FORM))))
        (T (DRIVER-LOOP ENV (PRINT (EVAL FORM ENV))))))


(DEFINE (LOOKUP1 NAME VARS VALS ENV)
  (COND ((NULL VARS)
         (LOOKUP NAME (CDR ENV)))
        ((EQ NAME (CAR VARS))
         (COND ((ATOM (CAR VALS)) VALS)
               ((EQ (CAAR VALS) '&LABELED)
                (LIST '&PROCEDURE (CADAR VALS) (CADDAR VALS) ENV))
               (T VALS)))
        (T (LOOKUP1 NAME (CDR VARS) (CDR VALS) ENV))))

For  driver-loop  see  Figure  1. 

For  lookup  see  Figure  3. 

For  eval  see  Figure  7. 

Figure  10 

An  Alternative  Solution  for  Local  Lexical  Scoping 
and  Dynamic  Top-Level  References 
(Modified  Top-Level  Driver  Loop  and  Environment  Lookup) 
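
The idea behind Figure 10 can be tried out in a modern language. The following Python sketch is only an illustration under stated assumptions, not the interpreter of the figure: environments are lists of dictionaries rather than paired lists of names and values, procedure bodies are Python functions that take an environment, and the names Labeled, Procedure, lookup, and define are hypothetical stand-ins for &LABELED-objects, &PROCEDURE-objects, lookup1, and the DEFINE clause of the driver loop.

```python
class Labeled:
    """Top-level definition stored without an environment (cf. &LABELED)."""
    def __init__(self, params, body):
        self.params, self.body = params, body


class Procedure:
    """Parameters, body, and closing environment (cf. &PROCEDURE)."""
    def __init__(self, params, body, env):
        self.params, self.body, self.env = params, body, env

    def __call__(self, *args):
        frame = dict(zip(self.params, args))
        return self.body([frame] + self.env)


def lookup(name, env):
    for i, frame in enumerate(env):
        if name in frame:
            value = frame[name]
            if isinstance(value, Labeled):
                # Close over the environment at hand. Labeled objects live
                # only in the last (top-level) frame, so this is top level.
                return Procedure(value.params, value.body, env[i:])
            return value
    raise NameError(name)


def define(env, name, params, body):
    # Build a fresh single-frame top-level environment for each definition.
    return [{**env[0], name: Labeled(params, body)}]


env = [{}]
# f refers to g, which is not defined yet when f is defined.
env = define(env, "f", ["x"], lambda e: lookup("g", e)(lookup("x", e) + 1))
env = define(env, "g", ["y"], lambda e: lookup("y", e) * 2)
result = lookup("f", env)(3)
```

Because the closure for f is built only when f is looked up, it sees the top-level frame that already contains g, so the forward reference works without any separate procedures argument.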
Part Two
State 


Decomposition  of  State 


We  saw  in  Part  One  that  an  interactive  top-level  loop  necessarily 
violates  referential  transparency.  We  wish  to  deal  with  the  computer  as  an 
entity  with  state,  which  changes  over  time  by  interacting  with  a user.  In 
particular,  we  want  the  computer  to  change  over  time  by  accumulating 
procedure  definitions. 

Just  as  the  user  wishes  to  think  of  the  computer  as  having  state, 
he  may  find  it  conceptually  convenient  to  organize  a program  similarly: 
one  part  may  deal  with  another  part  having  state.  Often  programs  are 
written  for  the  purpose  of  analyzing  or  simulating  a physical  system.  If 
modules  of  the  program  are  to  reflect  the  conceptual  divisions  of  the 
physical  system,  then  the  program  modules  may  well  need  to  have  independent 
state  variables.  Thus  the  notion  of  state  is  not  just  a programming  trick, 
but  may  be  required  by  the  nature  of  the  problem  domain. 

A simpler  example  of  the  use  of  state  involves  the  use  of  a pseudo- 
random number  generator.  A LISP  version  of  one  might  be: 


(DEFINE (RANDOM SEED)
  ((LAMBDA (Z)
     (COND ((> Z 0) Z)
           (T (+ Z -32768.))))
   (* SEED 899.)))

This  version  of  random  uses  the  power-residue  method  for  a 16-bit  two's- 
complement  number  representation;  the  value  produced  is  a pseudo-random 
integer,  and  also  is  the  seed  for  the  next  call.  The  caller  of  random  is 
required  to  save  this  value  and  supply  it  on  the  next  call  to  random. 
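
To make the burden on the caller concrete, here is a minimal Python sketch of the same generator. It assumes that the arithmetic wraps around in signed 16-bit fashion, as the text's mention of a 16-bit two's-complement representation suggests; the helper to_int16 and the name random_step are hypothetical and exist only for this illustration.

```python
def to_int16(n):
    """Wrap an integer into the signed 16-bit range (assumed machine arithmetic)."""
    n &= 0xFFFF
    return n - 0x10000 if n >= 0x8000 else n


def random_step(seed):
    """One step of the power-residue generator; the result is also the next seed."""
    z = to_int16(seed * 899)
    return z if z > 0 else to_int16(z - 32768)


# The caller has to save each result and hand it back on the next call.
seed = 1
first = random_step(seed)
second = random_step(first)
```

Every caller, and every caller of a caller, must carry the seed along in this style, which is exactly the nuisance discussed next.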

This  fact  is  unfortunate.  The  caller  really  has  no  interest  in  the 
workings  of  random,  and  would  much  prefer  to  simply  call  it  as  "(random)", 
for  example,  and  get  back  a random  number  — because  this  would  reflect 
most  precisely  the  abstract  notion  of  "random  number  generator".  Such  a 
generator  would  have  to  have  state. 

Suppose  we  are  willing  to  live  with  this  nuisance.  Consider  now 
building  some  larger  program  using  random.  Many  levels  up,  the  programmer 
who  writes  some  high-level  routine  very  likely  does  not  care  at  all  that  a 
low-level  routine  uses  random;  he  may  not  even  know  about  the  existence  of 
that  routine.  However,  if  the  state  of  the  pseudo-random  number  generator 
is  to  be  preserved,  that  programmer  will  have  to  deal  with  some  state 
quantity  he  knows  nothing  about,  for  the  sake  of  a program  ten  levels 
removed  from  his  thinking.  Just  as  procedures  had  to  be  passed  all  around 
for  the  sake  of  eval  in  Figure  2,  so  the  state  of  random  must  be  passed  up 
and  down  and  all  around  by  programs  which  don't  really  care.  This  clearly 
violates  our  principle  of  modularity.  (For  an  example  of  how  bad  this  can 
get, see {Note Gaussian}.)

As  another  example,  suppose  that  George  writes  mapcar,  and  Harry 
uses  it.  Harry  complains  that  mapcar  is  too  slow.  George  then  decides  to 
collect  some  statistics  about  the  use  of  mapcar,  such  as  the  number  of  times 
called,  the  average  length  of  the  second  argument,  and  so  on.  He  first 
writes an experimental mapcar to count the number of calls:

(DEFINE (MAPCAR F L N)
  (CONS (OLDMAPCAR F L) (+ N 1)))

(DEFINE (OLDMAPCAR F L)
  (COND ((NULL L) '())
        (T (CONS (F (CAR L))
                 (OLDMAPCAR F (CDR L))))))

and  asks  Harry  to  use  it  for  a while  in  his  program.  "I  had  to  add  an 
extra  argument  to  keep  track  of  the  count,"  says  George,  "and  in  order  to 
return  both  the  result  and  the  count,  I had  to  cons  them  together.  Please 
rewrite  your  program  to  keep  track  of  the  count  and  pass  it  on  from  one 
call  on  mapcar  to  the  next."  Harry's  reply  is  "unprintable". 

Now  Bruce  comes  along  and  asks  Harry  how  to  use  Harry's  program. 
Harry  says,  "Just  write  (differentiate  exp  var  n),  where  exp  is  the  expression 
to  be  differentiated,  var  is  the  variable  with  respect  to  which  to 
differentiate,  and  n is  George's  statistics  counter  — but  that  may  go  away 
next  week."  Bruce  gives  Harry  a funny  look,  then  goes  away  and  writes  his 
own  differentiate,  using  George's  documentation  for  the  old  mapcar,  of  course, 
unaware  that  the  new  one  has  been  installed... 

George’s  new  mapcar  conceptually  has  state.  The  state  information 
should  be  local  to  the  definition  of  mapcar,  because  that  information  is  not 
anyone  else's  business,  and  George  has  no  business  requiring  everyone  else 
to  keep  track  of  it  for  him.  George  and  Harry  and  Bruce  all  wish  George 
had  a way  to  maintain  local  state  information  in  mapcar. 


Side  Effects  and  Local  State 

Traditionally  local  state  is  maintained  through  some  sort  of  "side 
effect".  We  can  always  avoid  the  use  of  side  effects  if  we  are  willing  to 
pass  all  state  variables  around.  As  we  have  seen,  this  requires  a 
monolithic  conception  of  the  program  structure.  If  we  wish  to  break  a 
program  up  into  independent  modules,  each  with  local  state  information,  we 
must  seek  another  method. 

We  claim  that  any  such  method  effectively  constitutes  a side 
effect.  If  a module  has  hidden  state,  then  its  behavior  can  potentially 
change  over  time. 

If  only  one  module  in  the  system  has  local  state,  then  we  can  hide 
the  side  effect  by  making  it  the  top-level  module  of  the  system,  as  we  have 
done for driver-loop. (For an example of this, see {Note Weber}.) If more
than one module has state, however, then each may perceive changes in the
other's behavior. This is the essence of side effect.

The  concept  of  side  effect  is  induced  by  particular  choices  of 
boundaries  between  parts  of  a larger  system.  If  a system  boundary  encloses 
all  processes  of  interest  (the  system  is  closed),  we  need  no  concept  of 
side  effect  to  describe  that  system  as  a whole  in  vacuo.  If,  however,  we 
wish  to  make  an  abstraction  by  dividing  the  system  into  modules  more  than 
one  of  which  has  independent  state,  then  we  have  by  this  action  created  the 
concept  of  side  effect. 

We  are  forced  to  introduce  side  effects  as  a technique  for 
constructing  modular  systems.  But  side  effects  violate  referential 
transparency by altering the meanings of expressions; we expect (+ 3 4)
always to mean the same thing, but we cannot say the same for (+ 3 (random)).
Two techniques for achieving modularity have come into direct conflict.

The most common form of side effect in programming languages is the
assignment statement, which alters the meaning of a variable. LISP
provides this notion in the setq construct:

(SETQ X 43)

returns 43, and as a side effect alters the meaning of x so that subsequent
references will obtain 43 also.

With  this,  George  can  now  write: 

(DEFINE (MAPCAR F L)
  (MAPCAR1 F L (SETQ N (+ N 1))))

(DEFINE (MAPCAR1 F L HUNOZ)
  (OLDMAPCAR F L))

There are still some minor problems here. The function mapcar1 and the
variable hunoz are used solely to throw away the value of the setq form. It
is so common to use setq only for its side effect that another
construction, progn, is very useful:

(PROGN e1 e2 ... eN)

evaluates each of the forms ei in order, throwing away the values of all
but  the  last  one.  Notice  that  we  specifically  require  them  to  be  evaluated 
in  order;  this  concept  did  not  occur  in  the  specification  of  our  earlier 
interpreters,  because  it  was  not  necessary  in  the  absence  of  side  effects. 
Similarly,  it  was  not  useful  to  be  able  to  throw  away  values  in  the  absence 
of  side  effects.  (We  did  throw  away  a value  in  driver-loop,  but  that  was  one 
which  resulted  from  calling  print,  which  of  course  is  assumed  to  have  a 
side  effect!)  Using  progn,  George  can  write: 

(DEFINE (MAPCAR F L)
  (PROGN (SETQ N (+ N 1))
         (OLDMAPCAR F L)))




There  remains  the  problem  of  the  global  variable  n,  which  Harry  or  Bruce 
might  stumble  across  by  accident.  George  has  to  have  some  handle  to  get  at 
the  statistics  counter,  and  any  handle  George  can  use  intentionally,  Bruce 
and  Harry  can  use  accidentally.  One  thing  that  George  can  do  is  rename  n 
to mapcar-statistics-counter, and warn Bruce and Harry not to use a global
variable  with  that  name.  This  is  still  better  than  the  original  situation 
— at  least  now  Bruce  and  Harry  need  not  change  their  programs,  and  it  is 
George's  responsibility  to  find  a name  which  does  not  conflict.  {Note  Can 
George  do  better?} 
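
George's final arrangement can be sketched in Python as follows. This is an illustration only: the module-level variable mapcar_statistics_counter plays the role of the renamed global n, old_mapcar is a hypothetical stand-in for oldmapcar, and the two statements in the body of mapcar correspond to the two forms of the progn.

```python
mapcar_statistics_counter = 0  # global with a name unlikely to clash


def old_mapcar(f, items):
    """The original, uninstrumented mapcar."""
    return [f(x) for x in items]


def mapcar(f, items):
    global mapcar_statistics_counter
    mapcar_statistics_counter += 1   # side effect, like (SETQ N (+ N 1))
    return old_mapcar(f, items)      # value of the last form, as in PROGN


# Callers keep the two-argument interface and never see the counter.
squares = mapcar(lambda x: x * x, [1, 2, 3])
lengths = mapcar(len, ["ab", "c"])
```

Harry and Bruce call mapcar exactly as before; only George needs to know that the counter exists.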

In  the  case  of  random,  where  the  state  information  is  truly  local  in 
that  no  one  wants  to  access  it  except  its  owner,  we  can  combine  the  use  of 
lexical  scoping  and  of  side  effects  to  manipulate  a completely  hidden  state 
variable.  For  example,  suppose  we  want  several  independent  pseudo-random 
number  generators,  initialized  with  different  seeds.  We  can  make  a pseudo- 
random number  generator  generator  as  follows: 

(DEFINE (RGEN SEED)
  (LAMBDA () (PROGN (SETQ SEED
                          ((LAMBDA (Z) (COND ((> Z 0) Z)
                                             (T (+ Z -32768.))))
                           (* SEED 899.)))
                    SEED)))

Each  call  to  rgen  delivers  as  its  value  a new  pseudo-random  number 
generator  which  is  an  instance  of  the  prototype  described  by  the  lambda- 
expression  which  is  the  body  of  rgen.  Each  one  has  a state  variable  which 
is  its  seed.  The  state  of  each  instance  is  distinct  from  that  of  every 
other  instance.  This  gives  one  the  power  of  the  own  variables  of  ALGOL  60 
without  any  additional  mechanism. 
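
The same combination of lexical scoping and assignment is available in Python through closures. The sketch below is a rough analogue of rgen, again assuming signed 16-bit wraparound arithmetic; the helper to_int16 is hypothetical, and nonlocal plays the role of setq on the captured variable seed.

```python
def to_int16(n):
    """Wrap an integer into the signed 16-bit range (assumed machine arithmetic)."""
    n &= 0xFFFF
    return n - 0x10000 if n >= 0x8000 else n


def rgen(seed):
    """Return a fresh generator whose seed is hidden in its closure."""
    def generator():
        nonlocal seed                      # assign to the captured variable
        z = to_int16(seed * 899)
        seed = z if z > 0 else to_int16(z - 32768)
        return seed
    return generator


g1 = rgen(1)
g2 = rgen(1)
from_g1 = [g1(), g1()]   # g1 advances its own state twice
from_g2 = [g2()]         # g2 is unaffected by the calls on g1
```

Each call to rgen creates a separate seed variable, so the two generators advance independently, and nothing outside a generator can reach its seed.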


Side  Effects  in  the  Interpreter 

In  order  to  write  a simple  interpreter  which  implements  the  side 
effect  setq,  we  will  postulate  the  existence  of  two  side  effect  operators 
which alter S-expressions:

(RPLACA X Y) and (RPLACD X Y)

return the value of x (which must not be atomic), but as a side effect
alter x so that its car or cdr, respectively, is the value of y. (The
introduction of operators which modify S-expressions causes a number of
nasty problems, which we will consider presently.) We will use these
operators to alter the structure of the environment env. We modify eval to
recognize the setq construct (see Figure 11). On seeing "setq" in the
"operator position" of the expression, eval dispatches to evsetq, after
recursively evaluating the value to be assigned. evsetq uses lookup to find
the effective binding of the variable mentioned in the setq. If there is
such a binding, rplaca is used to change the value associated with the
variable. If there is no such binding, then the intent is to initialize a
top-level variable; ev-top-level-setq locates the top-level environment
(which is always at the end of any environment) and creates a new binding
by altering the environment structure.

We also modify eval to recognize progn. evprogn is a tail-recursive
loop which evaluates each subform of the progn form in turn, throwing away
each value but the last. {Note progn Wizardry}
(DEFINE (EVAL EXP ENV)
  (COND ((ATOM EXP)
         (COND ((NUMBERP EXP) EXP)
               (T (VALUE EXP ENV))))
        ((EQ (CAR EXP) 'QUOTE)
         (CADR EXP))
        ((EQ (CAR EXP) 'LAMBDA)
         (LIST '&PROCEDURE (CADR EXP) (CADDR EXP) ENV))
        ((EQ (CAR EXP) 'SETQ)
         (EVSETQ (CADR EXP) (EVAL (CADDR EXP) ENV) ENV))
        ((EQ (CAR EXP) 'PROGN)
         (EVPROGN (CDR EXP) ENV NIL))
        ((EQ (CAR EXP) 'COND)
         (EVCOND (CDR EXP) ENV))
        (T (APPLY (EVAL (CAR EXP) ENV)
                  (EVLIS (CDR EXP) ENV)))))

(DEFINE (EVSETQ VAR VAL ENV)
  ((LAMBDA (SLOT)
     (COND ((EQ SLOT '&UNBOUND)
            (EV-TOP-LEVEL-SETQ VAR VAL ENV))
           (T (CAR (RPLACA SLOT VAL)))))
   (LOOKUP VAR ENV)))

(DEFINE (EV-TOP-LEVEL-SETQ VAR VAL ENV)
  (COND ((NULL (CDR ENV))
         (CADAR (RPLACA ENV
                        (CONS (CONS VAR (CAAR ENV))
                              (CONS VAL (CDAR ENV))))))
        (T (EV-TOP-LEVEL-SETQ VAR VAL (CDR ENV)))))

(DEFINE (EVPROGN EXPS ENV HUNOZ)
  (COND ((NULL (CDR EXPS)) (EVAL (CAR EXPS) ENV))
        (T (EVPROGN (CDR EXPS) ENV (EVAL (CAR EXPS) ENV)))))

For  value,  lookup,  and  bind  see  Figure  3. 

For evcond and evlis see Figure 5.

For apply see Figure 7.

For lookup1 see Figure 10 (not Figure 3).

Figure  11 

Evaluator  with  User  Side  Effects  (Assignment  to  Variables) 
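
The behavior of evsetq and ev-top-level-setq can be sketched in Python as follows. The representation is an assumption made for this illustration: an environment is a list of frames, each frame is a pair of parallel Python lists (names, values), and the top-level frame is last. The function names lookup_slot and evsetq are hypothetical, and returning None stands in for &UNBOUND.

```python
def lookup_slot(name, env):
    """Return (values, index) for the innermost binding of name, or None."""
    for names, values in env:
        if name in names:
            return values, names.index(name)
    return None


def evsetq(name, value, env):
    slot = lookup_slot(name, env)
    if slot is None:
        # No binding anywhere: create one in the top-level frame,
        # which is always the last frame of the environment.
        names, values = env[-1]
        names.insert(0, name)
        values.insert(0, value)
    else:
        values, i = slot
        values[i] = value        # in-place update, like RPLACA on the slot
    return value


env = [(["x"], [1]), (["n"], [0])]
evsetq("x", 5, env)   # x is bound locally, so its slot is overwritten
evsetq("y", 7, env)   # y is unbound, so it becomes a top-level variable
```

Both branches alter existing structure rather than building a new environment, which is why every procedure that shares the environment sees the change.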
Equipotency of setq and rplaca

We pulled a fast one when we introduced rplaca and rplacd for the
sake of implementing setq (though we actually only used rplaca). We used a
side  effect  to  define  the  implementation  of  side  effects.  While  this  makes 
a fine  meta-circular  description,  it  doesn't  constitute  a definition  of 
side  effects  founded  on  the  original  meta-circular  recursion  equations 
interpreter. 

We  could  implement  an  interpreter  which  would  define  a side  effect 
without  itself  using  side  effects.  Such  a definition  would  encapsulate  the 
entire  state  of  the  user's  data  structures  into  a single  interpreter  data 
structure  which  is  passed  around  by  a top-level  loop.  Constructing  such  an 
interpreter  would  involve  turning  a regular  interpreter  inside  out  (in  much 
the same way gaussian was everted in {Note Weber}). This is extremely
difficult and lengthy, and the module boundaries within the interpreter are
so destroyed that the resulting interpreter is nearly impossible to
understand.  We  will  spare  the  reader  the  details. 

We  settle  for  a meta-circular  description  of  side  effects.  Now 
that we have seen how to implement setq in terms of rplaca and rplacd, we can
also do the reverse, completing the meta-circle (see Figure 13). We use
the procedural version of cons shown earlier, modified to provide two
"setting procedures" sa and sd, which provide the ability to alter the car
and  cdr. 


(DEFINE (CONS A D)
  (LAMBDA (M)
    (M A D (LAMBDA (Z) (SETQ A Z)) (LAMBDA (Z) (SETQ D Z)))))

(DEFINE (CAR X)
  (X (LAMBDA (A D SA SD) A)))

(DEFINE (CDR X)
  (X (LAMBDA (A D SA SD) D)))

(DEFINE (RPLACA X Y)
  (X (LAMBDA (A D SA SD)
       (PROGN (SA Y) X))))

(DEFINE (RPLACD X Y)
  (X (LAMBDA (A D SA SD)
       (PROGN (SD Y) X))))

Figure  13 

Procedural  ("Actors-like")  Implementation  of  cons  and  Friends 


We originally introduced side effects such as setq to help us build
modules such as random which have local state. Now, using the technique of
constructing procedures, we find that cons can be viewed as a constructor
of modules, just as mapgen was. cons constructs modules ("cons cells")
which use setq to maintain a local state.
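
The construction of Figure 13 carries over almost directly to any language with closures and assignment to captured variables. The following Python sketch mirrors the figure under that assumption; the inner function names set_a and set_d are hypothetical counterparts of sa and sd, and nonlocal stands in for setq.

```python
def cons(a, d):
    def set_a(z):
        nonlocal a
        a = z

    def set_d(z):
        nonlocal d
        d = z

    # A cell is a procedure that hands its parts and setters to a message m.
    return lambda m: m(a, d, set_a, set_d)


def car(x):
    return x(lambda a, d, sa, sd: a)


def cdr(x):
    return x(lambda a, d, sa, sd: d)


def rplaca(x, y):
    x(lambda a, d, sa, sd: sa(y))
    return x


def rplacd(x, y):
    x(lambda a, d, sa, sd: sd(y))
    return x


cell = cons(1, 2)
rplaca(cell, 9)
after_rplaca = (car(cell), cdr(cell))
rplacd(cell, 7)
after_rplacd = (car(cell), cdr(cell))
```

The only mutable state is the pair of variables captured by each cell, so each cons cell is a small module with local state in the sense used above.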


Side  Effects  and  Equality 

"Things  are  seldom  what  they  seem, 

Skim  milk  masquerades  as  cream..." 

— Gilbert  and  Sullivan 

(H.M.S. Pinafore)

"Plus ça change, plus c'est la même chose."

— Alphonse  Karr 

Our  descriptions  of  setq  and  rplaca,  both  informal  and  meta- 
circular,  are  imprecise.  They  admit  a number  of  drastically  different 
interpretations  of  the  behavior  of  the  system.  We  would  all  agree  that  for 
rplaca to mean anything at all like what we want, the expression:


((LAMBDA (X)
   (PROGN (RPLACA X 'Z)
          (CAR X)))
 (CONS 'A '(B C)))

Puzzle  #1 


should  evaluate  to  z.  But  what  about  this  case: 


((LAMBDA (X Y)
   (PROGN (RPLACA X 'Z)
          (CAR Y)))
 (CONS 'A '(B C))
 (CONS 'A '(B C)))

Puzzle #2

Should  this  evaluate  to  a or  z?  Nearly  all  LISP  systems  would  produce  a, 
but  there  are  arguments  for  both  possibilities.  Similarly,  should  this: 




Steele  and  Sussman 




The  Art  of  the  Interpreter 


((LAMBDA (X)
   ((LAMBDA (U V)
      (PROGN (RPLACA U 'Z)
             (CAR V)))
    X X))
 (CONS 'A '(B C)))

Puzzle  #3 


evaluate  to  a or  z?  Again  there  are  arguments  for  both  possibilities. 

Before  we  can  meaningfully  consider  these  questions,  we  must  have  a 
more  precise  notion  of  what  we  mean  by  "rplaca".  Let  us  review  its 
description:

If x has as its value a non-atomic S-expression, and we
evaluate the expression (rplaca x y), then after this
evaluation, the value of the expression (car x) is y.

This description depends upon a critical assumption. We have a notion of a
thing which is the value of x, such that several references to the variable
x all refer to the same thing. But what do we mean by "same"?

The  concept  of  side  effect  is  inseparable  from  the  notion  of 
equality/identity/sameness.  The  only  way  one  can  observationally  determine 
that  a side  effect  has  occurred  is  when  the  same  object  behaves  in  two 
different ways at different times. {Note rplaca Can Alter car Instead}
Conversely,  the  only  way  one  can  determine  that  two  objects  are  the  same  is 
to  perform  a side  effect  on  one  and  look  for  an  appropriate  change  in  the 
behavior  of  the  other. 

In  order  to  determine  the  answers  to  the  Puzzles  above,  we  must 
determine  what  properties  are  required  of  "sameness".  There  may  be 
different  points  of  view  regarding  sameness,  which  may  lead  to  different 
answers  to  the  Puzzles. 

If we agree that the answer to Puzzle #1 is z, then we have
implicitly adopted the notion of consistency of variable reference, because
we have referred to the variable x twice. As a property of the sameness
predicate =, we write: (= x x). We can say that referring to a variable
does  not  make  a copy  of  its  value  (because  if  it  did,  the  rplaca  in 
Puzzle  #1  would  have  changed  only  a copy  of  the  value  of  x,  and  (car  x) 
would  extract  the  car  of  a different  copy,  producing  a). 

Given this, and given that we accept the interpreter of Figure 11
and  believe  in  its  meta-circularity,  we  are  forced  to  conclude  that  the 
answer  to  Puzzle  #3  is  also  z.  We  must  consider  all  access  paths  and  show 
that  no  copying  can  occur  which  would  allow  the  answer  to  be  a.  The  meta- 
circularity requires  that  any  property  of  the  interpreted  language  also 
hold  for  the  text  of  the  interpreter,  and  vice  versa.  The  answer  to 
Puzzle  #1  requires  that  variable  references  not  produce  implicit  copies, 
and  so  neither  can  variable  references  in  the  text  of  the  interpreter. 
(Consistent with this, our particular interpreter has no explicit code in
lookup  which  specifies  copying.)  The  other  place  in  Puzzle  #3  where  copying 
might  occur  is  in  the  binding  of  u and  v.  Examining  the  text  of  our 
particular  meta-circular  interpreter  shows  that  bind  also  has  no  explicit 
code  for  copying.  There  remains  the  possibility  that  binding  does 
implicitly  copy  in  the  text  of  the  meta-circular  interpreter;  this  would 
consistently  cause  copying  in  the  bindings  of  the  interpreted  code,  because 
env  would  be  copied  whenever  bound  in  the  text  of  the  interpreter.  This, 
however,  would  cause  the  answer  to  Puzzle  #1  to  be  a,  because  env  is  bound 
at  other  places  which  would  cause  incorrect  copying.  We  therefore  conclude 
that  no  implicit  copying  can  occur,  and  so  the  answer  to  Puzzle  #3  is  z. 

We  emphasize  that  this  result  rests  on  our  acceptance  of  a 
particular  class  of  meta-circular  interpreters.  (These  interpreters, 
however,  closely  model  what  real  LISP  systems  do.)  There  are  other 
languages  which  do  implicitly  copy  structured  values  when  binding 
variables,  such  as  Algol  60  when  using  call-by-value.  For  such  a language, 
the  answer  to  Puzzle  #3  would  be  a (if  we  represented  the  list  (a  b c)  as  an 
Algol  60  array,  for  example),  even  though  the  answer  to  Puzzle  #1  would 
still  be  z. 
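
The difference between the two binding disciplines can be simulated in Python. The sketch below is only a model under stated assumptions: Cell is a hypothetical mutable pair, rplaca is treated as a primitive that does not copy its argument, and the parameter bind is an artificial hook applied whenever a value is bound to a variable (the identity function for LISP-style binding, copy.copy for call-by-value in the style of Algol 60).

```python
import copy


class Cell:
    """A mutable pair standing in for a cons cell."""
    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr


def rplaca(x, y):
    x.car = y          # primitive: operates on the object it is given
    return x


def puzzle1(bind):
    x = bind(Cell("a", ("b", "c")))
    rplaca(x, "z")
    return x.car       # two references to the same variable


def puzzle3(bind):
    x = bind(Cell("a", ("b", "c")))

    def inner(u, v):
        rplaca(u, "z")
        return v.car

    return inner(bind(x), bind(x))   # x is bound twice, to u and to v


no_copy = (puzzle1(lambda v: v), puzzle3(lambda v: v))
with_copy = (puzzle1(copy.copy), puzzle3(copy.copy))
```

Under both disciplines Puzzle #1 yields z, since merely referring to a variable never copies; only Puzzle #3 distinguishes them.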

One  can  argue  both  for  and  against  copying  during  binding  on  the 
basis  of  modularity.  Copying  isolates  the  caller  from  the  called  routine 
by  preventing  the  called  routine  from  performing  under-the-table  side- 
effects  on  the  caller's  data  objects.  Not  copying  allows  data  objects  to 
encapsulate  independent  pieces  of  state  which  can  be  operated  on  by  low- 
level  routines  whose  details  need  not  be  understood  by  their  caller  (an 
example  of  such  a data  object  is  the  symbol  table  of  an  assembler,  with  its 
insertion  and  lookup  routines). 

We now consider Puzzle #2. If we accept that binding and variable
referencing do not make copies, then Puzzle #2 is a question about the
nature of cons: if cons is called twice with arguments which are the same,
are the two results the same? (Note that this is the inverse of Postulate
4 for S-expressions in {Note S-expression Postulates and Notation}.) If
the  answer  is  consistently  a (as  in  most  real  LISP  systems),  then  cons  must 
generate  a new  object  every  time  it  is  called.  (It  must  produce  different 
results  if  the  two  sets  of  arguments  differ,  and  an  answer  of  a to 
Puzzle  #2  requires  different  results  if  the  two  sets  of  arguments  are  the 
same.)  cons  perforce  contains  a side  effect.  Calls  to  it  are  not 
referentially  transparent. 
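
Python objects happen to behave like the cons of most real LISP systems in this respect, which gives a quick way to see the point. In this sketch Cell and the functions cons and rplaca are hypothetical stand-ins written for the illustration; each call to cons allocates a new object, and that allocation is the hidden side effect.

```python
class Cell:
    """A mutable pair standing in for a cons cell."""
    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr


def cons(a, d):
    return Cell(a, d)      # a new, distinct object on every call


def rplaca(x, y):
    x.car = y
    return x


# Puzzle #2: two calls to cons with the same arguments.
x = cons("a", ("b", "c"))
y = cons("a", ("b", "c"))
rplaca(x, "z")
puzzle2 = y.car            # y was built by a different call to cons
same_object = x is y
```

Since two textually identical calls yield distinguishable results, the expression (cons 'a '(b c)) cannot be replaced by its value, which is what the loss of referential transparency means here.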

The  other  possibility,  given  that  variable  binding  and  variable 
referencing  do  not  make  copies,  is  that  the  answer  to  Puzzle  #2  is  z.  In 
this  case,  cons  of  the  same  arguments  must  always  produce  the  same  result. 
This  choice  leads  to  galloping  non-modularity  of  data  structures  without 
compensation.  Suppose,  for  example,  we  represent  arrays  as  lists  of 
numbers  (a  reasonable  LISP  representation),  and  want  to  alter  the  last 
element  of  one  such  array  (using  rplaca).  Under  this  scheme,  all  arrays 
whatsoever with the same last element would be magically altered! A
language  with  such  characteristics  would  be  extremely  difficult  to  control. 
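
The array example can be reproduced with a cons that always returns the same cell for the same arguments. The Python sketch below is a deliberately naive model: hcons keeps a table keyed on its arguments (a hypothetical design chosen for this illustration), lists end in None, and the table is not updated when a cell is mutated.

```python
class Cell:
    """A mutable pair; instances hash and compare by identity."""
    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr


_table = {}


def hcons(a, d):
    """Return the one cell for (a, d), creating it only on first use."""
    key = (a, d)
    if key not in _table:
        _table[key] = Cell(a, d)
    return _table[key]


def make_list(*items):
    result = None
    for item in reversed(items):
        result = hcons(item, result)
    return result


def to_python(cell):
    out = []
    while cell is not None:
        out.append(cell.car)
        cell = cell.cdr
    return out


def last_cell(cell):
    while cell.cdr is not None:
        cell = cell.cdr
    return cell


p = make_list(1, 2, 3)
q = make_list(7, 8, 3)        # shares its final cell with p
last_cell(p).car = 99         # meant to change only p
```

After the assignment both p and q end in 99, although the program never mentioned q when making the change.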

Supposing  now  that  binding  does  make  copies  as  in  Algol  60,  the 
answer  to  Puzzle  #2  must  be  a.  Here  it  does  not  matter  whether  cons  of  the 
same arguments produces the same result, since the bindings of x and y will
make copies anyway. We may, however, consider this variant:


(PROGN (RPLACA (CONS 'A '(B C)) 'Z)
       (CAR (CONS 'A '(B C))))

Puzzle #2a


Here we have simply substituted the expressions (cons 'a '(b c)) for
the occurrences of x and y. If cons always returns the same object for the
same  inputs,  then  Puzzle  #2  and  Puzzle  #2a  have  different  answers  if 
bindings  copy,  but  may  have  the  same  answers  if  bindings  do  not  copy  (they 
may  not  have  the  same  answer  if  cons  notices  that  we  have  pulled  the  rug 
out  from  under  it  and  produces  a new  version  because  the  old  one  was 
changed!).  There  is  also  a quibble  as  to  whether  the  passing  of  an 
argument  to  rplaca  in  itself  constitutes  a binding  — if  so,  rplaca  must  be 
completely  ineffectual,  because  it  always  receives  a copy!  We  must  then 
regard  rplaca  as  a built-in  system  primitive;  the  user  would  have  no  way  to 
define  such  a thing.  This  would  be  most  unfortunate. 

We  have  examined  many  of  the  design  decisions  for  the  meaning  of 
rplaca,  cons,  and  equality.  If  side  effects  are  to  be  usable  at  all,  the 
references  to  things  denoted  by  variables  must  not  make  copies  of  those 
things. If the user is to be able to write procedures which produce
lasting side effects on their arguments (as system-supplied primitive
operators do), then there must be a variable binding mechanism which does
not make copies. (LISP's binding mechanism in fact does not copy. Algol
60's call-by-value mechanism does copy structured data, but its
call-by-name mechanism does not; we will study this in Part Three.) If the
variable  binding  (or  assignment)  mechanism  does  not  make  copies,  then  cons 
must  generate  a new,  distinct  object  on  each  call. 

The reader may have noted that we have been talking in circles for
the last several paragraphs: in attempting to elucidate the meaning of
sameness, we have discussed side effects, and in so doing used the word
"same" nearly every other sentence. The point is that it is not possible
to define them separately; the meanings of "equality" and "side effect"
simultaneously constrain each other. With this in mind, we will
investigate the choice of a primitive equality predicate.

The equality predicate we choose should be sufficiently finely
grained to distinguish any two objects which have potentially distinct
behavior, yet should not be so finely grained as to distinguish entities
which otherwise would have the same behavior. Thus we have two desiderata:

[1]  Two  objects  which  are  observed  to  behave 
differently  must  not  be  equal. 


[2]  Conversely,  we  would  like  two  objects  which  are 
adjudged  unequal  to  exhibit  differing  behaviors  under 
suitable  circumstances. 
Any useful equality predicate must satisfy [1]. Unfortunately, satisfying
[2] also may be too difficult; the equivalence of behavior for procedural
objects  is  an  unsolvable  problem.  We  are  thus  forced  to  settle  for  an 
equality  predicate  which  may  make  more  distinctions  than  are  strictly 
necessary. 

LISP  has  two  standard  equality  predicates:  equal  and  eq.  We 
exhibited  a definition  of  equal  in  Part  Zero.  In  Part  Zero  we  also  gave  a 
description  of  eq,  but  defined  it  only  on  atoms;  LISP  usually  extends  eq 
to  all  S-expressions  in  such  a way  as  to  distinguish  the  results  of 
different  calls  to  cons  (regardless  of  the  arguments  given  to  cons). 
Variable references and variable binding "preserve eqness".

In  the  absence  of  rplaca  ("pure  LISP"),  eq  and  equal  both  satisfy 
desideratum  [1].  equal,  however,  makes  fewer  unnecessary  distinctions  than 
eq.  By  desideratum  [2],  equal  is  therefore  preferred  to  eq.  (The  technique 
of  "hash-consing"  [Goto]  can  be  used  in  this  situation  to  make  eq  and  equal 
effectively  the  same.) 

In  the  presence  of  side  effects  such  as  rplaca,  equal  fails  to  make 
sufficiently  many  distinctions.  Each  call  to  cons  produces  distinct 
objects,  which  equal  may  fail  to  distinguish.  In  this  case,  equal  fails 
desideratum  [1].  Thus,  in  the  presence  of  rplaca,  eq  is  the  preferred 
equality  predicate. 
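
The argument above can be illustrated with a rough sketch in Python rather than in the LISP of this paper. It assumes that Python's `is` plays the role of eq, `==` the role of equal, and list mutation the role of rplaca; the names `a`, `b`, and `alias` are purely illustrative.

```python
# Two separately constructed lists, analogous to two calls to cons.
a = [1, 2]
b = [1, 2]
alias = a                      # variable binding does not copy

# Before any side effect, "equal" cannot tell a and b apart; "eq" can.
equal_before = (a == b)
eq_before = (a is b)

# A side effect on a (the analogue of rplaca) ...
a[0] = 99

# ... is visible through alias, which is eq to a, but not through b.
equal_after = (a == b)
alias_tracks_a = (alias is a) and (alias[0] == 99)
b_unchanged = (b == [1, 2])
```

Here a and b behave differently once a is altered, although the equal-like test judged them the same beforehand; only the eq-like test drew the distinction that desideratum [1] requires.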

In  summary,  indeed  "the  more  things  change,  the  more  they  remain 
the  same".  Two  distinct  objects  may  look  the  same  because  one  masquerades 
as  the  other;  they  can  be  operationally  distinguished  only  by  purposely 
altering  the  behavior  of  just  one  of  them.  Thus  the  ability  to  decide 
whether  two  objects  are  the  same  is  directly  correlated  with  the  ability  to 
perform  side  effects  on  them. 


Dynamic  Scoping  as  a State-Decomposition  Discipline 

As  we  saw  in  the  preceding  section,  side  effects  can  become  rather 
complicated.  To  help  keep  this  complexity  under  control,  we  ought  to 
abstract  and  package  common  patterns  of  their  use. 

Suppose  we  have  a procedure  print-number  which  prints  numbers: 

(DEFINE (PRINT-NUMBER N)
        ((LAMBDA (Q R)
                 (COND ((ZEROP Q) (PRINT-DIGIT R))
                       (T (PROGN (PRINT-NUMBER Q)
                                 (PRINT-DIGIT R)))))
         (/ N 10.)
         (REMAINDER N 10.)))

Now people find this program very useful and use it in all their programs.
Normally we want to print numbers in radix 10 (decimal), but
occasionally (for example, in a debugging aid) we want to print numbers in
other radices, such as 8 or 16. One might generalize the print-number
program to take the radix as an extra argument:

(DEFINE (PRINT-NUMBER N RADIX)
        ((LAMBDA (Q R)
                 (COND ((ZEROP Q) (PRINT-DIGIT R))
                       (T (PROGN (PRINT-NUMBER Q RADIX)
                                 (PRINT-DIGIT R)))))
         (/ N RADIX)
         (REMAINDER N RADIX)))

Of course, then everyone who uses print-number must supply the radix. This
is mildly annoying, because most of the time one wants decimal printing,
and one tires of writing "10." all the time. One might write another
program for most people to use:

(DEFINE (PRINT-10 N)
        (PRINT-NUMBER N 10.))

This example is simple, but a real print procedure in a real LISP system
may be controlled by dozens of parameters like radix: format parameters
for printing floating-point numbers, which file to print to, file-dependent
format parameters such as line width and page length, file-dependent
processing routines (e.g. scrolling for display terminals), abbreviation
format parameters for S-expressions, etc. All these extra parameters to
print are really determined by the larger context in which print is used,
but this context is usually not determined by the immediate caller of
print. A program which generates and prints successive prime numbers
should not have to deal with the complexities of output files; in
particular, one does not want to have to rewrite the program just to direct
the output to a line printer instead of a disk file. Context decisions are
usually made at a much higher level (perhaps interactively by the user).
Therefore the solution of using procedures like print-10 is not acceptable;
such procedures only serve as abbreviations, binding the many parameters to
constants at too low a decision level.

Another  idea  is  to  pass  the  extra  parameters  for  print  control 
through  the  intermediate  levels  of  the  program.  But  this  violates  the 
modularity  of  the  intermediate  modules,  which  generally  have  no  interest  in 
print’s  screwy  parameters.  On  the  other  hand,  an  occasional  intermediate 
module  will  be  interested  in  dealing  with  a few  of  the  parameters  (but 
probably  not  all  of  them!).  We  would  like  a mechanism  for  dealing  with 
only  the  parameters  of  interest,  without  having  to  deal  with  all  of  them 
all  of  the  time. 

Side  effects  can  do  the  job.  We  can  make  all  the  parameters 
globally  available  variables  (in  the  top-level  environment),  initialized  to 
reasonable  default  values,  and  invite  all  interested  parties  to  perform 
setq as necessary. This technique has disadvantages. If every program
just changes the parameters at will, then each program must re-set all the
parameters  (even  the  ones  not  of  interest)  for  its  own  uses  of  print.  This 
is  even  worse  than  just  passing  print  all  the  parameters! 

We  can  require  a convention  whereby  the  parameters  normally  have 
their  initial  default  values,  and  any  program  which  modifies  a parameter 
must eventually restore it to its previous value. For example, a procedure
to print in octal might look like:

(DEFINE (PRINT-8 N)
        ((LAMBDA (OLDRADIX)
                 (PROGN (SETQ RADIX 8)
                        (PRINT-NUMBER N)
                        (SETQ RADIX OLDRADIX)))
         RADIX))

This convention allows print-8 to locally alter the radix, in a manner
transparent to its caller; it does not interfere with the way in which its
caller may be using print.
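
The save-and-restore convention can be sketched in Python as follows. This is only an illustration under stated assumptions: `RADIX` is a module-level global standing in for the top-level variable, `digits` is a hypothetical stand-in for print-number that returns a string instead of printing, and the `try`/`finally` is an addition not present in the LISP above (it restores the value even if an error occurs).

```python
RADIX = 10  # global print parameter, initialized to a reasonable default

def digits(n):
    """Return the digits of n in the current RADIX as a string."""
    q, r = divmod(n, RADIX)
    d = "0123456789ABCDEF"[r]
    return d if q == 0 else digits(q) + d

def digits_8(n):
    """Temporarily set RADIX to 8, then restore the previous value."""
    global RADIX
    old_radix = RADIX          # remember the caller's setting
    try:
        RADIX = 8
        return digits(n)
    finally:
        RADIX = old_radix      # restore it on the way out
```

After a call such as `digits_8(64)` the global `RADIX` is back to whatever the caller had, so the caller's own uses of `digits` are unaffected.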

This convention is a standard pattern of use. It is a stack
discipline on the values of radix (or whatever other variables). We would
like to capture this pattern as an abstraction in our language.

Surprise!  We  have  seen  this  abstraction  before:  dynamically 
scoped  variables  behave  in  precisely  this  way.  Dynamically  scoped 
variables  conceptually  have  a built-in  side  effect  — we  took  advantage  of 
this  at  the  end  of  Part  One  to  fix  the  problem  with  the  top-level  loop. 
Binding  a dynamically  scoped  variable  such  as  radix  can  be  said  to  cause  a 
side  effect  because  it  alters  the  behavior  of  a (superficially)  unrelated 
procedure such as print in a referentially opaque manner. Such binding is
a particularly  structured  kind  of  side  effect,  because  it  guarantees  that 
the  side  effect  will  be  properly  undone  when  the  binder  has  finished 
executing.  Thus  with  dynamic  scoping  we  could  write: 

(DEFINE (PRINT-8 N)
        ((LAMBDA (RADIX)
                 (PRINT-NUMBER N))
         8))
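
For comparison, here is a minimal Python sketch of the same idea with the stack discipline built into the binding construct. Everything named here (`_dynamic`, `dynamic`, `binding`, `digits`) is hypothetical scaffolding; Python has no dynamically scoped variables of its own, so a per-variable stack is assumed as the representation.

```python
from contextlib import contextmanager

_dynamic = {"radix": [10]}     # each dynamic variable has a stack of values

def dynamic(name):
    """Current (innermost) value of a dynamic variable."""
    return _dynamic[name][-1]

@contextmanager
def binding(name, value):
    """Bind a dynamic variable for the duration of a with-block."""
    stack = _dynamic.setdefault(name, [])
    stack.append(value)
    try:
        yield
    finally:
        stack.pop()            # the binding is undone when the binder exits

def digits(n):
    """Digits of n in the radix that is dynamically in force."""
    q, r = divmod(n, dynamic("radix"))
    d = "0123456789ABCDEF"[r]
    return d if q == 0 else digits(q) + d

def digits_8(n):
    with binding("radix", 8):
        return digits(n)
```

The binder no longer saves or restores anything by hand; undoing the side effect is the responsibility of the binding construct itself.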

We saw in Part One that, precisely because dynamically scoped
variables are referentially opaque, we do not want all variables to be
dynamically scoped. But we have newly rediscovered dynamic variables in
another context and found them desirable. We therefore consider an
interpreter which supplies both lexical and dynamic variables (see Figure 14).

Here we have merged the dynamically scoped variable evaluator
(Figure 5) with the lexically scoped evaluator (Figure 11). We changed
apply to have an extra case, wherein an "open LAMBDA-expression" is
effectively closed at the time of its application using the environment of
its caller. eval is changed to once again supply the environment to apply.
This interpreter is almost identical to that of LISP 1.5 [LISP 1.5M], with
the difference that we write simply (lambda ...) to get a closed procedure
where in LISP 1.5 one must write (function (lambda ...)); in both cases one
must write '(lambda ...) to get an open LAMBDA-expression.

(DEFINE (EVAL EXP ENV)
        (COND ((ATOM EXP)
               (COND ((NUMBERP EXP) EXP)
                     (T (VALUE EXP ENV))))
              ((EQ (CAR EXP) 'QUOTE)
               (CADR EXP))
              ((EQ (CAR EXP) 'LAMBDA)
               (LIST '&PROCEDURE (CADR EXP) (CADDR EXP) ENV))
              ((EQ (CAR EXP) 'SETQ)
               (EVSETQ (CADR EXP) (EVAL (CADDR EXP) ENV) ENV))
              ((EQ (CAR EXP) 'PROGN)
               (EVPROGN (CDR EXP) ENV NIL))
              ((EQ (CAR EXP) 'COND)
               (EVCOND (CDR EXP) ENV))
              (T (APPLY (EVAL (CAR EXP) ENV)
                        (EVLIS (CDR EXP) ENV)
                        ENV))))

(DEFINE (APPLY FUN ARGS ENV)
        (COND ((PRIMOP FUN) (PRIMOP-APPLY FUN ARGS))
              ((EQ (CAR FUN) '&PROCEDURE)
               (EVAL (CADDR FUN)
                     (BIND (CADR FUN) ARGS (CADDDR FUN))))
              ((EQ (CAR FUN) 'LAMBDA)
               (EVAL (CADDR FUN)
                     (BIND (CADR FUN) ARGS ENV)))
              (T (ERROR))))

For value, lookup, and bind see Figure 3.
For evcond and evlis see Figure 5.
For lookup1 see Figure 10 (not Figure 3).

Figure 14
Interpreter with Both Open and Closed Procedures


Although  this  is  the  tradition,  it  doesn't  work  very  well.  The 
problem  is  that  the  lexical  variables  are  not  really  lexical.  Although 
lexical  references  cannot  incorrectly  refer  to  dynamically  intended 
bindings,  the  reverse  is  not  true.  Dynamic  variable  references  can  be 
captured  by  bindings  intended  to  be  strictly  lexical. 

For example, we might want to write a procedure which packages up
information about dealing with radix:

(DEFINE (RADIX-10 FUN)
        ((LAMBDA (RADIX) (FUN))
         10.))

This is more general than print-10 in that it allows us to wrap a binding of
radix around any piece of code, not just a call to print. (In a more
realistic example, we might package up the bindings of a dozen parameters
in a similar manner.)

There are two possibilities: should the argument to radix-10 be a
closed procedure or an open LAMBDA-expression? If closed:

(DEFINE (DO-SOMETHING-INTERESTING X FUN)
        (RADIX-10 (LAMBDA () (FORMAT-HAIR 'FOO (CADR X) FUN))))

(format-hair takes several arguments, one of them a procedure, and presumably
calls print at some level), then the binding of radix in radix-10 will not be
apparent to print, because the environment of the call to format-hair is that
of the closed procedure, which in turn is that of the call to radix-10
within do-something-interesting. Thus it fails to work at all. If the
argument to radix-10 is left open:

(DEFINE (DO-SOMETHING-INTERESTING X FUN)
        (RADIX-10 '(LAMBDA () (FORMAT-HAIR 'FOO (CADR X) FUN))))

then this fails to work at all because of a variable naming conflict with
fun. The third argument passed to format-hair will evaluate to the argument
which was passed to radix-10, namely the quoted lambda expression. This is
similar to the mapcar bug that originally got us thinking about lexical
scoping in Part One.

A solution  to  this  problem  is  to  maintain  separate  environments  for 
lexical  and  dynamic  variables;  this  will  guarantee  that  the  two  kinds 
cannot  interfere  with  each  other.  This  will  require  a special  syntax  for 
distinguishing  references  to  and  bindings  of  the  two  kinds  of  variables. 
We  will  choose  to  encode  lexical  variables  as  atomic  symbols,  as  before, 
and  dynamic  variables  as  lists  of  the  form  (dynamic  x),  where  x is  the  name 
of  the  dynamic  variable.  (This  choice  is  completely  arbitrary.  We  could 
have  chosen  to  encode  the  two  kinds  as  (lexical  x)  and  x;  or  as  (lexical  x) 
and  (dynamic  x),  leaving  atomic  symbols  as  such  free  to  encode  yet  something 
else;  but  we  have  chosen  this  because  in  practice  most  variable 
references,  even  in  a purely  dynamically  scoped  LISP,  are  lexical,  or  can 
be considered so.)

In our new interpreter (see Figure 15) we call the two environments
env (lexical) and denv (dynamic). The syntax of LAMBDA-expressions is
extended to accommodate two kinds of bindings; for example,

(LAMBDA (X Y (DYNAMIC Z) W) ...)

takes four arguments, and binds the parameters x, y, and w lexically, and z
dynamically. Using this syntax, we could write radix-10 in this way:

(DEFINE (RADIX-10 FUN)
        ((LAMBDA ((DYNAMIC RADIX)) (FUN))
         10.))


The  Art  of  the  Interpreter 


Steele  and  Sussman 



The  code  for  print-number  would  then  be  written: 

(DEFINE (PRINT-NUMBER N)
        ((LAMBDA (Q R)
                 (COND ((ZEROP Q) (PRINT-DIGIT R))
                       (T (PROGN (PRINT-NUMBER Q)
                                 (PRINT-DIGIT R)))))
         (/ N (DYNAMIC RADIX))
         (REMAINDER N (DYNAMIC RADIX))))

Most of the extra complexity in Figure 15 is devoted to the parsing of
LAMBDA-expression binding lists upon application by apply-procedure. (For the
sake of brevity we have omitted the parts of the interpreter which deal
with setq and progn; they could easily be re-inserted.)



(DEFINE (EVAL EXP ENV DENV)
        (COND ((ATOM EXP)
               (COND ((NUMBERP EXP) EXP)
                     (T (VALUE EXP ENV))))
              ((EQ (CAR EXP) 'QUOTE) (CADR EXP))
              ((EQ (CAR EXP) 'LAMBDA)
               (LIST '&PROCEDURE (CADR EXP) (CADDR EXP) ENV))
              ((EQ (CAR EXP) 'COND)
               (EVCOND (CDR EXP) ENV DENV))
              ((EQ (CAR EXP) 'DYNAMIC) (VALUE (CADR EXP) DENV))
              (T (APPLY (EVAL (CAR EXP) ENV DENV)
                        (EVLIS (CDR EXP) ENV DENV)
                        DENV))))

(DEFINE (APPLY FUN ARGS DENV)
        (COND ((PRIMOP FUN) (PRIMOP-APPLY FUN ARGS DENV))
              ((EQ (CAR FUN) '&PROCEDURE)
               (APPLY-PROCEDURE (CADR FUN) ARGS '() '() '() '()
                                (CADDDR FUN) DENV (CADDR FUN)))
              (T (ERROR))))

(DEFINE (APPLY-PROCEDURE VARS ARGS LVARS LARGS DVARS DARGS ENV DENV BODY)
        (COND ((NULL VARS)
               (COND ((NULL ARGS)
                      (EVAL BODY
                            (BIND LVARS LARGS ENV)
                            (BIND DVARS DARGS DENV)))
                     (T (ERROR))))
              ((NULL ARGS) (ERROR))
              ((ATOM (CAR VARS))
               (APPLY-PROCEDURE (CDR VARS) (CDR ARGS)
                                (CONS (CAR VARS) LVARS) (CONS (CAR ARGS) LARGS)
                                DVARS DARGS
                                ENV DENV BODY))
              ((EQ (CAAR VARS) 'DYNAMIC)
               (APPLY-PROCEDURE (CDR VARS) (CDR ARGS)
                                LVARS LARGS
                                (CONS (CADAR VARS) DVARS) (CONS (CAR ARGS) DARGS)
                                ENV DENV BODY))
              (T (ERROR))))

For evcond and evlis see Figure 2.
For value, bind, and lookup see Figure 3.
For lookup1 see Figure 10.

Figure 15
Interpreter with Separate Lexical and Dynamic Variables
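
To make the separation of the two environments concrete, here is a small Python sketch of an evaluator in the spirit of Figure 15. It is not a transcription of the figure: expressions are assumed to be nested Python lists, environments are assumed to be chains of `(dict, parent)` pairs, primitive operators are ordinary Python callables, and a two-armed `if` stands in for cond. All function names are hypothetical.

```python
def lookup(name, env):
    """Find name in a chain of (frame, parent) environments."""
    while env is not None:
        frame, parent = env
        if name in frame:
            return frame[name]
        env = parent
    raise NameError(name)

def evaluate(exp, env, denv):
    if isinstance(exp, int):
        return exp
    if isinstance(exp, str):                 # atomic symbol: lexical variable
        return lookup(exp, env)
    op = exp[0]
    if op == "quote":
        return exp[1]
    if op == "dynamic":                      # (dynamic x): dynamic variable
        return lookup(exp[1], denv)
    if op == "lambda":                       # closed over env only, not denv
        return ("procedure", exp[1], exp[2], env)
    if op == "if":
        test = evaluate(exp[1], env, denv)
        return evaluate(exp[2] if test else exp[3], env, denv)
    fun = evaluate(op, env, denv)
    args = [evaluate(a, env, denv) for a in exp[1:]]
    return apply_procedure(fun, args, denv)

def apply_procedure(fun, args, denv):
    if callable(fun):                        # primitive operator
        return fun(*args)
    _, params, body, env = fun
    if len(params) != len(args):
        raise TypeError("wrong number of arguments")
    lexical, dynamic = {}, {}
    for p, a in zip(params, args):
        if isinstance(p, str):
            lexical[p] = a                   # plain name: lexical binding
        elif p[0] == "dynamic":
            dynamic[p[1]] = a                # (dynamic name): dynamic binding
        else:
            raise SyntaxError("bad parameter")
    # Lexical frame extends the closure's env; dynamic frame extends the caller's denv.
    return evaluate(body, (lexical, env), (dynamic, denv))

top_env = ({}, None)
top_denv = ({"radix": 10}, None)

# Analogue of radix-10, binding 8 instead: wraps a dynamic binding around fun.
radix_8 = evaluate(
    ["lambda", ["fun"], [["lambda", [["dynamic", "radix"]], ["fun"]], 8]],
    top_env, top_denv)

# A closed procedure that reads the dynamic radix when called.
read_radix = evaluate(["lambda", [], ["dynamic", "radix"]], top_env, top_denv)
```

Under these assumptions `apply_procedure(radix_8, [read_radix], top_denv)` yields 8 even though `read_radix` is a closed procedure, and a lexical parameter that happens to be named `radix` does not capture the dynamic reference.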

Dynamic scoping provides an important abstraction for dealing with
side  effects  in  a controlled  way.  A low-level  procedure  may  have  state 
variables  which  are  not  of  interest  to  intermediate  routines,  but  which 
must  be  controlled  at  a high  level.  Dynamic  scoping  allows  any  procedure 
to  get  access  to  parts  of  the  state  when  necessary,  but  permits  most 
procedures  to  ignore  the  existence  of  the  state  variables.  The  existence 
of  many  dynamic  variables  permits  the  decomposition  of  the  state  in  such  a 
way  that  only  the  part  of  interest  need  be  dealt  with. 

If  dynamic  variables  are  integrated  with  the  lexical  environment, 
intractable  dilemmas  are  encountered.  (We  have  not  considered  here  all 
possible  such  integration  schemes,  but  the  authors  have  found  such 
difficulties  with  every  such  scheme  they  have  examined.)  We  have  therefore 
presented  an  interpreter  in  which  environments  for  the  two  kinds  of 
variable  are  separated. 


Summary


We  examined  the  effects  of  various  language  design  decisions  on  the 
programming  styles  available  to  a user  of  the  language,  with  particular 
emphasis  on  the  ability  to  incrementally  construct  modular  systems.  At 
each  step  we  exhibited  an  interactive  meta-circular  interpreter  for  the 
language  under  consideration.  Each  new  interpreter  was  the  result  of  an 
incremental  change  to  the  previous  interpreter. 

We  started  with  a simple  interpreter  for  LISP  recursion  equations. 
In  order  to  capture  certain  abstractions  we  were  forced  to  introduce 
procedural  data.  This  in  turn  forced  consideration  of  the  meanings  of  free 
variables  in  a procedure,  for  the  simplest  extension  unexpectedly 
introduced  dynamic  scoping  of  variables. 

We  were  compelled  to  turn  from  dynamic  scoping  to  lexical  scoping 
to  preserve  the  integrity  of  procedural  abstractions.  The  referentially 
transparent  language  thus  obtained  is  richer  than  expected.  It  allows  the 
definition  of  procedures  which  construct  other  procedures  by  instantiation 
of  a prototype.  Unfortunately,  we  found  that  complete  referential 
transparency  in  a language  makes  it  impossible  to  construct  an  interactive 
interface  to  the  interpreter.  But  such  an  interface  is  necessary  to 
satisfy  another  requirement  of  modular  construction:  that  parts  of  a 
program  can  be  independently  defined,  replaced,  and  debugged.  We  were 
forced to give up absolute referential transparency to admit an interactive
interface.

The  problems  of  the  interactive  interface  led  us  to  consider  the 
notion  of  state  as  a dimension  of  abstraction.  Just  as  we  didn't  want  to 
have  textually  monolithic  programs,  we  wanted  to  avoid  programs  which 
manipulate  a monolithic  representation  of  the  state.  The  decomposition  of 
the  state  of  a system  into  several  independent  parts  induces  the  notion  of 
a side  effect.  Side  effects  only  make  sense  relative  to  a definition  of 
equality  on  the  space  of  data  objects.  But  the  definition  of  equality 
itself  depends  simultaneously  on  the  notion  of  side  effect.  Only  a few  of 
the  choices  of  equality  predicate  and  side  effect  notion  are  consistent 
with  the  requirements  of  modular  construction. 

The  introduction  of  side  effects  is  inconsistent  with  referential 
transparency.  But  since  both  are  important  to  support  modular  construction 
we  must  accept  an  engineering  trade-off  between  them.  We  were  led  to  look 
for  controlled  patterns  of  side  effects  which  can  be  easily  understood  and 
safely  applied.  We  discovered  that  one  such  pattern  is  equivalent  to  the 
use  of  dynamically  scoped  variables  we  discussed  earlier.  We  investigated 
how  to  construct  a system  which  integrates  lexical  and  dynamic  scoping  in  a 
smooth  way. 

There  are  many  issues  yet  to  be  explored.  The  introduction  of  side 
effects  raises  questions  about  order  of  evaluation.  An  interesting  order 
provided by Algol 60 is call-by-name. This discipline, so unlike LISP's,
is  induced  from  a different  notion  of  procedure,  expressed  as  the  "copy 
rule".  This  idea  is  a syntactic  one,  and  so  differs  in  flavor  from  the 
procedural ideas embodied by the interpreters we have presented.
Consideration of syntactic transformations leads to the notion of
meta-procedures, such as macros, compilers, and simplifiers. We will explore
all of this in Parts Three and Four.

Notes


{Can  George  do  better?}  Page  34 

The  problem  here  is  that  George  needs  access  to  the  statistics 
counter  without  giving  that  access  to  anyone  else.  As  described  in  the 
next example George can make the counter an own variable, but how can he
get  access  to  it?  One  idea  is  that  George  can  define  mapcar  in  the 
following  manner: 


((LAMBDA (N)
         (PROGN (SETQ MAPCAR
                      (LAMBDA (F L)
                              (PROGN (SETQ N (+ N 1))
                                     (OLDMAPCAR F L))))
                (LAMBDA () N)))
 0)


This expression defines mapcar by setq'ing it to an appropriate procedure
(see {Note Driver Loop with Side Effects}). It then returns, as a value,
an  anonymous  procedure  which  accesses  the  value  of  the  statistics  counter. 
If  George  saves  this  value  and  uses  it  to  get  at  the  counter  when  he  needs 
it,  he  will  have  isolated  it  completely  from  everyone  else! 
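
A rough Python analogue of George's trick uses a closure to hide the counter. It differs from the LISP above in one labeled respect: instead of assigning a global mapcar and returning only the accessor, the hypothetical `make_counting_map` returns both procedures. `plain_map` is an assumed stand-in for the old mapcar.

```python
def make_counting_map(old_map):
    """Return (counted_map, read_counter) sharing one private counter."""
    n = 0                            # the "own variable": visible only below

    def counted_map(f, items):
        nonlocal n
        n += 1                       # side effect on the hidden state
        return old_map(f, items)

    def read_counter():
        return n

    return counted_map, read_counter

def plain_map(f, items):
    """Stand-in for the original mapcar."""
    return [f(x) for x in items]

mapcar, george_counter = make_counting_map(plain_map)
```

Only whoever holds `george_counter` can read the count; callers of `mapcar` have no name by which to reach `n`.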


{Debugging}  Page  27 

It  has  been  suggested  that  it  is  possible  always  to  write  correct 
programs.  Such  a situation  would  eliminate  the  need  for  debugging.  The 
problem  with  this  idea  is  that  a crucial  part  of  the  problem-solving 
strategy  is  the  decomposition  of  problems  into  presumably  independent 
subproblems. There is no guarantee that this is possible in general, but
even when it is not possible, there are often general strategies for
approximating a solution to a problem by composing the solutions to almost
independent subproblems. Often one can make progress on the solution to a
hard  problem  by  considering  the  solution  of  a simplified  version  of  the 
problem  which  is  similar  in  some  essential  aspect  to  the  original  one  but 
which  differs  from  it  in  detail.  Once  the  solutions  to  the  subproblems  are 
obtained,  they  must  be  fitted  together,  and  the  details  of  the  interactions 
smoothed  out.  The  fixing  of  unanticipated  interactions  is  debugging. 

Even  in  those  cases  where  a decomposition  into  completely 
independent  subproblems  is  possible,  it  is  not  always  feasible.  In  order 
to be sure that the solutions to the subproblems are really independent it
is  necessary  to  understand  both  the  problem  and  the  possible 
implementations  and  interactions  of  subsolutions  so  completely  that  one 
must  effectively  solve  the  entire  problem  before  choosing  the  correct 
decomposition.  This  compromises  the  decomposition  strategy. 

{Driver Loop with Side Effects} Pages 37, 53, 59

This driver loop (Figure N1) is similar to the one in Figure 8
(which didn't work). This one does work because, although top-level
procedure definitions are closed in the current top-level environment, that
environment is changed using a side effect when new definitions are made.

(DEFINE (DRIVER-LOOP-1 ENV FORM)
        (COND ((ATOM FORM)
               (DRIVER-LOOP ENV NIL (PRINT (EVAL FORM ENV))))
              ((EQ (CAR FORM) 'DEFINE)
               (DRIVER-LOOP ENV
                            (EVSETQ (CAADR FORM)
                                    (LIST '&PROCEDURE
                                          (CDADR FORM)
                                          (CADDR FORM)
                                          ENV)
                                    ENV)
                            (PRINT (CAADR FORM))))
              (T (DRIVER-LOOP ENV NIL (PRINT (EVAL FORM ENV))))))

For eval and evsetq see Figure 11.
For lookup1 see Figure 3 (not Figure 10, despite Figure 11!).

Figure N1
Implementation of driver-loop Using Side Effects

The top level of LISP 1 [LISP 1M] and LISP 1.5 [LISP 1.5M] actually
was not at all like the one presented here. Rather than reading one
S-expression and giving it to eval, it read two S-expressions and gave them
to apply. Such a top level is called an evalquote top level (see Figure N2).

(DEFINE (DRIVER-LOOP-1 PROCEDURES FORM)
        (DRIVER-LOOP-2 PROCEDURES FORM (READ)))

(DEFINE (DRIVER-LOOP-2 PROCEDURES FORM1 FORM2)
        (COND ((EQ FORM1 'DEFINE)
               (DRIVER-LOOP (BIND (LIST (CAAR FORM2))
                                  (LIST (LIST '&PROCEDURE (CDAR FORM2) (CADR FORM2)))
                                  PROCEDURES)
                            (PRINT (CAAR FORM2))))
              (T (DRIVER-LOOP PROCEDURES
                              (PRINT (APPLY FORM1 FORM2 PROCEDURES))))))

For driver-loop see Figure 1.
For apply see Figure 2.
For bind see Figure 3.

Figure N2
Driver Loop for an evalquote Top Level


This  driver  loop  is  somewhat  nicer  than  the  one  in  Figure  1, 
because  the  one  in  Figure  1 had  an  essentially  useless  cond  clause.  The 
case  of  typing  an  atom  was  not  useful,  because  there  were  no  top-level 
values  for  variables.  Once  we  introduce  procedural  objects,  this  is  no 
longer true. But evalquote requires an inconsistency of notation: at the
top level one must write car ((a . b)), whereas in the middle of a program
one would write (car '(a . b)).

The notion of evalquote also has some theoretical motivation, if one
thinks  of  LISP  as  a universal  machine  akin  to  a universal  Turing  machine. 
In  this  model  one  takes  a description  of  a machine  to  be  simulated  and  a 
description  of  its  input  data,  and  gives  them  to  the  universal  machine  to 
process.  In  LISP,  the  universal  machine  is  apply. 

{Gaussian} Pages 32, 68

A typical example of the use of a pseudo-random number generator is
to construct a generator for pseudo-random numbers with a Gaussian
distribution by adding up a large number of uniformly distributed
pseudo-random numbers. We would like to write it roughly as in Figure N3.

(DEFINE (GAUSSIAN)
        (WEBER 0 43))

(DEFINE (WEBER X N)
        (COND ((= N 0) X)
              (T (WEBER (+ X (RANDOM)) (- N 1)))))

Figure N3
"Gaussian" Pseudo-Random Number Generator


This  code  should  add  up  43  pseudo-random  numbers  obtained  by  calling  random. 
We  cannot  write  such  a random  without  side  effects,  however.  We  can  arrange 
to  pass  the  seed  around,  as  in  Figure  N4. 


(DEFINE (GAUSSIAN SEED)
        (WEBER 0 43 SEED))

(DEFINE (WEBER X N SEED)
        (COND ((= N 0) (CONS X SEED))
              (T ((LAMBDA (NEWSEED)
                          (WEBER (+ X NEWSEED) (- N 1) NEWSEED))
                  (RANDOM SEED)))))

Figure N4
"Gaussian" Pseudo-Random Number Generator, Passing SEED

This  is  much  more  complicated.  The  user  of  gaussian  must  maintain  the  seed. 
Moreover,  gaussian  and  weber  each  need  to  return  two  values;  here  we  cons 
them  together,  and  the  user  must  take  them  apart. 
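
The burden on the caller can be seen in a small Python sketch of the seed-passing style. The generator `random_step` is a toy linear congruential step whose constants are chosen only for illustration; as in Figure N4, the new seed doubles as the pseudo-random number. A loop replaces the recursion of the LISP version.

```python
def random_step(seed):
    """One step of a toy linear congruential generator (illustrative constants)."""
    return (seed * 1103515245 + 12345) % 2**31

def weber(x, n, seed):
    """Add n pseudo-random numbers to x; return (sum, final seed)."""
    while n > 0:
        seed = random_step(seed)     # no hidden state: the seed is threaded through
        x, n = x + seed, n - 1
    return x, seed

def gaussian(seed):
    """Sum of 43 pseudo-random numbers, paired with the seed to use next."""
    return weber(0, 43, seed)
```

A caller must write something like `value, seed = gaussian(seed)` every time, taking the pair apart and remembering to pass the updated seed to the next call.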


{labels} Pages 29, 59

This technique can be generalized to allow the definition of
recursive local procedures. (Although the Y-operator discussed in {Note
Y-operator} can be used to implement recursive local procedures, it is
extremely painful to construct several mutually recursive procedures.
Although mutually recursive procedures can be theoretically eliminated (by
procedure integration), this process destroys the conceptual structure of
the program.)

Consider writing a procedure to construct the reverse of a given
list:

(DEFINE (REVERSE L)
        (REVERSE1 L '()))

(DEFINE (REVERSE1 OLD NEW)
        (COND ((NULL OLD) NEW)
              (T (REVERSE1 (CDR OLD) (CONS (CAR OLD) NEW)))))

The procedure reverse1 is irrelevant to the outside world; we would like to
hide it inside reverse.

Let us invent a new construction to permit the definition of local
procedures with names:

(LABELS (((f1 v11 v12 ...) body1)
         ((f2 v21 ...) body2)
         ...
         ((fN vN1 vN2 ...) bodyN))
        body)

means  the  value  of  body  when  evaluated  in  an  environment  where  the 
specified  procedure  definitions  are  available.  For  example: 

(DEFINE (REVERSE L)
        (LABELS (((REVERSE1 OLD NEW)
                  (COND ((NULL OLD) NEW)
                        (T (REVERSE1 (CDR OLD) (CONS (CAR OLD) NEW))))))
                (REVERSE1 L '())))
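
For readers more familiar with block-structured languages, the effect of this use of labels resembles a nested function definition. The Python sketch below is only an analogy, with Python lists assumed in place of LISP lists; it says nothing about how labels is implemented.

```python
def reverse(items):
    """Reverse a list using a helper that is invisible outside reverse."""
    def reverse1(old, new):
        # Move the first element of old onto the front of new, recursively.
        if not old:
            return new
        return reverse1(old[1:], [old[0]] + new)
    return reverse1(items, [])
```

The helper `reverse1` can call itself by name, yet the name is not defined outside `reverse`.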

The same trick works for labels as for the top level: when lookup1 has found
a LABELS-defined function, it has the correct environment in hand for
constructing a &PROCEDURE-object. We need only add a test in eval for the
labels construct, and arrange for the appropriate &LABELED-objects to be
constructed (see Figure N5).


(DEFINE (EVAL EXP ENV)
        (COND ((ATOM EXP)
               (COND ((NUMBERP EXP) EXP)
                     (T (VALUE EXP ENV))))
              ((EQ (CAR EXP) 'QUOTE)
               (CADR EXP))
              ((EQ (CAR EXP) 'LAMBDA)
               (LIST '&PROCEDURE (CADR EXP) (CADDR EXP) ENV))
              ((EQ (CAR EXP) 'LABELS)
               (EVLABELS (CADR EXP) EXP '() '() ENV))
              ((EQ (CAR EXP) 'COND)
               (EVCOND (CDR EXP) ENV))
              (T (APPLY (EVAL (CAR EXP) ENV)
                        (EVLIS (CDR EXP) ENV)))))

(DEFINE (EVLABELS DEFINITIONS EXP NAMES FNS ENV)
        (COND ((NULL DEFINITIONS)
               (EVAL (CADDR EXP) (BIND NAMES FNS ENV)))
              (T (EVLABELS (CDR DEFINITIONS)
                           EXP
                           (CONS (CAAAR DEFINITIONS) NAMES)
                           (CONS (LIST '&LABELED
                                       (CDAAR DEFINITIONS)
                                       (CADAR DEFINITIONS))
                                 FNS)
                           ENV))))

For value, lookup, and bind see Figure 3.
For evcond and evlis see Figure 5.
For apply see Figure 7.
For lookup1 see Figure 10 (not Figure 3).

Figure N5
An Evaluator For Local Lexical Scoping,
Dynamic Top-Level References,
and Local Definition of Recursive Procedures

{labels with Side Effects} Page 37

This implementation of labels (see Figure N6) applies the technique
of {Note Driver Loop with Side Effects} to the implementation of labels in
{Note labels}. This is in fact how labels (or its cousin label) is usually
implemented in "real" LISP systems.

(DEFINE (EVLABELS DEFINITIONS EXP NAMES FNS ENV)
        (COND ((NULL DEFINITIONS)
               (EVLABELS-CLOSE (CADR EXP) EXP NIL (BIND NAMES FNS ENV)))
              (T (EVLABELS (CDR DEFINITIONS)
                           EXP
                           (CONS (CAAAR DEFINITIONS) NAMES)
                           (CONS '&UNASSIGNED FNS)
                           ENV))))

(DEFINE (EVLABELS-CLOSE DEFINITIONS EXP VALS ENV)
        (COND ((NULL DEFINITIONS)
               (EVLABELS-CLOBBER NIL EXP (CDAR ENV) VALS ENV))
              (T (EVLABELS-CLOSE (CDR DEFINITIONS)
                                 EXP
                                 (CONS (LIST '&PROCEDURE
                                             (CDAAR DEFINITIONS)
                                             (CADAR DEFINITIONS)
                                             ENV)
                                       VALS)
                                 ENV))))

(DEFINE (EVLABELS-CLOBBER HUNOZ EXP SLOTS VALS ENV)
        (COND ((NULL VALS)
               (EVAL (CADDR EXP) ENV))
              (T (EVLABELS-CLOBBER (RPLACA SLOTS (CAR VALS))
                                   EXP
                                   (CDR SLOTS)
                                   (CDR VALS)
                                   ENV))))

For eval and evsetq see Figure 11.
For lookup1 see Figure 3 (not Figure 10, despite Figure 11!).

Figure N6
Implementation of labels Using Side Effects

A primitive operator might be a very complicated object in a "real"
LISP implementation; it would probably have machine-language code within
it. We are not interested in the details of a particular host machine
here; we wish only to present a simple meta-circular definition of primop
and primop-apply. We will notate the procedural object which is the value of
car (say) in the initial top-level environment <the-primitive-procedures> as
"&CAR". This object has no interesting properties except that it is EQ to
itself and not to any other object. The initial top-level environment
therefore looks like:

(((CAR CDR EQ ATOM NULL NUMBERP + - * ...)
  &CAR &CDR &EQ &ATOM &NULL &NUMBERP &+ &- &* ...))

Given  this,  we  can  define  primop  and  primop-apply  as  in  Figure  N7. 


(DEFINE (PRIMOP FUN)
  (COND ((EQ FUN '&CAR) T)
        ((EQ FUN '&CDR) T)
        ((EQ FUN '&EQ) T)
        ((EQ FUN '&ATOM) T)
        ((EQ FUN '&NULL) T)
        ((EQ FUN '&NUMBERP) T)
        ((EQ FUN '&+) T)
        ((EQ FUN '&-) T)
        ((EQ FUN '&*) T)
        (T NIL)))

(DEFINE (PRIMOP-APPLY FUN ARGS)
  (COND ((EQ FUN '&CAR) (CAR (CAR ARGS)))
        ((EQ FUN '&CDR) (CDR (CAR ARGS)))
        ((EQ FUN '&EQ) (EQ (CAR ARGS) (CADR ARGS)))
        ((EQ FUN '&ATOM) (ATOM (CAR ARGS)))
        ((EQ FUN '&NULL) (NULL (CAR ARGS)))
        ((EQ FUN '&NUMBERP) (NUMBERP (CAR ARGS)))
        ((EQ FUN '&+) (+ (CAR ARGS) (CADR ARGS)))
        ((EQ FUN '&-) (- (CAR ARGS) (CADR ARGS)))
        ((EQ FUN '&*) (* (CAR ARGS) (CADR ARGS)))
        (T (ERROR))))


Figure  N7 

Meta-Circular  Definition  of  primop  and  primop-apply 
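
The same idea can be sketched in Python, assuming that pairs are modeled as two-element tuples and that each primitive is an opaque object distinguished only by identity. The class Primitive, the table PRIMITIVES and the AMP_ names are hypothetical stand-ins for &CAR and friends, and only four of the operators are shown.

```python
import operator

class Primitive:
    """An opaque object; like &CAR, only its identity matters."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "&" + self.name

AMP_CAR = Primitive("CAR")
AMP_CDR = Primitive("CDR")
AMP_EQ = Primitive("EQ")
AMP_PLUS = Primitive("+")

# Each primitive object paired with the host operation that implements it.
PRIMITIVES = [
    (AMP_CAR, lambda pair: pair[0]),
    (AMP_CDR, lambda pair: pair[1]),
    (AMP_EQ, operator.is_),
    (AMP_PLUS, operator.add),
]

def primop(fun):
    """True if fun is (identical to) one of the primitive objects."""
    return any(fun is prim for prim, _ in PRIMITIVES)

def primop_apply(fun, args):
    """Apply the host operation that corresponds to the primitive fun."""
    for prim, implementation in PRIMITIVES:
        if fun is prim:
            return implementation(*args)
    raise ValueError("not a primitive operator")

# The initial top-level environment: one frame of names and values.
TOP_LEVEL_ENV = [(["CAR", "CDR", "EQ", "+"],
                  [AMP_CAR, AMP_CDR, AMP_EQ, AMP_PLUS])]
```

A table keeps the two definitions in step, where the figure repeats the list of operators in both primop and primop-apply.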


{progn Wizardry}  Page 35


We  defined  evprogn  in  the  way  shown  in  Figure  11  rather  than  in  this 
"more  obvious"  way: 

(DEFINE (EVPROGN EXPS ENV LASTVAL)
  (COND ((NULL EXPS) LASTVAL)
        (T (EVPROGN (CDR EXPS) ENV (EVAL (CAR EXPS) ENV)))))

for  a technical  reason:  we  would  like  the  tail-recursive  properties  of  the 
code  being  interpreted  to  be  reflected  in  the  interpretation  process.  We 
specifically  want  recursive  calls  as  the  last  subform  of  a progn  form  to  be 
tail-recursive  if  the  progn  form  itself  is  in  a tail-recursive  situation. 
For  example,  we  might  write  a loop  such  as: 

(DEFINE (PRINTLOOP X)
  (COND ((= X 0) 'BLASTOFF)
        (T (PROGN (PRINT X)
                  (PRINTLOOP (- X 1))))))

We  would  like  this  loop  to  be  iterative,  but  it  can  be  iterative  only  if 
the  recursive  call  to  printloop  is  tail-recursive.  Our  point  is  that  if  the 
"obvious"  version  of  evprogn  is  used  in  the  interpreter,  then  interpretation 
of  printloop  will  not  be  tail-recursive  because  of  the  "stacking  up  of  evprogn 
frames"  (the  last  call  to  eval  from  evprogn  is  not  tail-recursive).  This  is 
unnecessary  because  evprogn  does  nothing  with  the  last  value  but  return  it 
anyway. 

By the way, the use of progn in a cond clause as shown above in
printloop is a very common situation, as is the use of a progn as the body of
a procedure (cf. George's last experimental version of mapcar). As a
convenience, most real LISP implementations define extended versions of cond
and lambda which implicitly treat clauses (resp. bodies) as progn forms (see
Figure N8). This allows us to write such things as:

(DEFINE (PRINTLOOP X)
  (SLEEP 1)
  (COND ((= X 0) 'BLASTOFF)
        (T (PRINT X)
           (PRINTLOOP (- X 1)))))

(DEFINE (EVCOND CLAUSES ENV)
  (COND ((NULL CLAUSES) (ERROR))
        ((EVAL (CAAR CLAUSES) ENV)
         (EVPROGN (CDAR CLAUSES) ENV NIL))
        (T (EVCOND (CDR CLAUSES) ENV))))

(DEFINE (APPLY FUN ARGS)
  (COND ((PRIMOP FUN) (PRIMOP-APPLY FUN ARGS))
        ((EQ (CAR FUN) '&PROCEDURE)
         (EVPROGN (CDDR FUN)
                  (BIND (CADR FUN) ARGS (CADDDR FUN))
                  NIL))
        (T (ERROR))))

For  eval  and  evprogn  see  Figure  11. 

Figure  N8 

Treating cond Clauses and Procedure Bodies as Implicit progn Forms


Finally,  we  note  that  progn  is  unnecessary  except  as  a programming 
convenience.  Because  the  language  is  defined  to  be  executed  in  applicative 
order (cf. {Note Normal Order Loses} in [Revised Report]), we can force
the  sequencing  of  evaluation,  as  well  as  throw  away  unwanted  values,  by 
using  LAMBDA-expressions.  We  first  note  that 

(PROGN e1 e2 ... eN-1 eN)  ≡  (PROGN e1 (PROGN ... (PROGN eN-1 eN) ... ))

so that we need worry only about progn with two subforms:

(PROGN e1 e2)  ≡  ((LAMBDA (HUNOZ F) (F))
                   e1
                   (LAMBDA () e2))

(see  [Imperative]  and  [Revised  Report]). 
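
The two-subform equivalence can be tried out in Python, which also evaluates arguments before applying a function. This is a minimal sketch: the helper names note and progn2 are hypothetical, and the second subform has to be wrapped in a lambda by hand, just as in the LAMBDA-expression above.

```python
log = []

def note(x):
    """A side-effecting expression: record x, then return it."""
    log.append(x)
    return x

def progn2(e1_value, e2_thunk):
    """((LAMBDA (HUNOZ F) (F)) e1 (LAMBDA () e2)).
    e1 has already been evaluated by the time we get here; its value is
    bound to hunoz and thrown away, and only then is e2 evaluated."""
    return (lambda hunoz, f: f())(e1_value, e2_thunk)

# (PROGN (note 1) (note 2) (note 3)) nested as (PROGN e1 (PROGN e2 e3))
result = progn2(note(1), lambda: progn2(note(2), lambda: note(3)))
```

After this runs, log holds the three values in order and result is the value of the last subform.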

{quote Mapping}  Page 7

What  the  quote  notation  achieves  is  a simple  mapping  of  the  entire 
set  of  S-expressions  into  a subset  of  itself;  this  mapping  is  trivially 
invertible.  This  is  necessary  in  order  to  leave  some  S-expressions  left 
over  to  represent  other  things. 

This idea may be applied to natural numbers as well. We can
"quote" a number by doubling it. In this way every even number represents
half of itself, just as the S-expression (quote a) represents the S-
expression in its cadr. This leaves all the odd numbers for other
purposes. For example, we can define an ordered set of variables and let
3^n encode the n'th variable, for n>10. We can also let 3^1 mean cond, 3^2 mean
lambda, etc. We can then encode a procedure call as 5^f 7^x 11^y 13^z ... where f is
the encoding of the procedure and x, y, z, ... are the encodings of the
arguments; cond forms and LAMBDA-expressions can be similarly encoded. For
example,


(COND ((NULL A) 3) (T 6))


might  be  encoded  as  the  number 


31<Z1  310  a a 

31,(5<5  ; >76)11(53  712) 


5V  7 


In  this  manner  we  can  encode  all  of  the  LISP  language  as  natural  numbers. 
This is an example of the technique of "Gödelization".
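
The encoding of numbers, variables and procedure calls described above can be sketched in Python with arbitrary-precision integers. This is an illustration only: the function names are hypothetical, the choice of variable number 11 for F is an assumption, and only a flat call with constant arguments is encoded, because nested forms produce exponent towers far too large to compute.

```python
from itertools import count

def quote_number(n):
    """Quote a natural number by doubling it; the result is always even."""
    return 2 * n

def unquote_number(code):
    """Invert quote_number; only even codes are quoted numbers."""
    if code % 2 != 0:
        raise ValueError("odd codes are not quoted numbers")
    return code // 2

def encode_variable(n):
    """The n'th variable is encoded as 3**n, which is odd."""
    return 3 ** n

def odd_primes_from_5():
    """Yield 5, 7, 11, 13, ... by trial division."""
    for candidate in count(5, 2):
        if all(candidate % d for d in range(3, int(candidate ** 0.5) + 1, 2)):
            yield candidate

def encode_call(f_code, arg_codes):
    """Encode a call as 5**f * 7**x * 11**y * ..., an odd number."""
    code = 1
    for prime, exponent in zip(odd_primes_from_5(), [f_code, *arg_codes]):
        code *= prime ** exponent
    return code

def exponent_of(prime, code):
    """Recover one component by counting how often prime divides code."""
    exponent = 0
    while code % prime == 0:
        code //= prime
        exponent += 1
    return exponent

# Hypothetical example: the call (F 1 2), with F assumed to be variable 11.
call_code = encode_call(encode_variable(11),
                        [quote_number(1), quote_number(2)])
```

Decoding reverses the steps: exponent_of(11, call_code) yields the code of the second argument, and unquote_number turns that code back into the number it represents.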


{quote  Shafts  the  Compiler}  Page  19 

We  emphasize  that  it  is  not  the  presence  of  dynamically  scoped 
variables  which  makes  standard  LISP  difficult  for  compilers,  but  the  very 
fact that the LAMBDA-expressions are quoted. It is impossible in general to
determine whether a quoted S-expression is intended to be code or just some
constant data. Most LISP systems provide another kind of quote called
function. In LISP 1 [LISP 1M] and LISP 1.5 [LISP 1.5M] this used to produce
funarg objects (we call them &PROCEDURE objects), but in more recent LISP
systems [Moon] [Teitelman] an ordinary FUNCTION-expression has been made
equivalent to a quoted expression, serving only as a flag to the compiler
that the quoted expression is intended as code. However, the introduction
of the ' notation for quoted expressions has led many programmers to
prefer the use of quote to function for reasons of conciseness. This in turn
has  required  changes  to  the  compiler  to  specially  recognize  standard 
situations  where  this  is  used  (e.g.  the  functional  argument  to  mapcar),  but 
this  patch  doesn't  solve  the  problem  generally. 

{rplaca Can Alter car Instead}  Page 40

We have implicitly thought of the rplaca operation as modifying a
cons so as to have a different car. However, there is an interpretation in
which rplaca is thought of as modifying the car operator. Taking the car of
an object always involves both the car operator and the object. When we
perform an rplaca on the object denoted by foo, all we can say is that the
value of (car foo) may have changed. It is not necessarily clear what aspect
of that expression has changed. Using this idea, we can express rplaca in
terms of setq as in Figure N9. Note that we depend on eq to distinguish
different results of cons.


(DEFINE (RPLACA X Y)
  (PROGN ((LAMBDA (OLDCAR)
            (SETQ CAR
                  (LAMBDA (Z)
                    (COND ((EQ Z X) Y)
                          (T (OLDCAR Z))))))
          CAR)
         X))


Figure  N9 

rplaca in Terms of setq Which Modifies car

{S-expression Postulates and Notation}  Pages 4, 41

S-expressions form a number system analogous to that for the
natural numbers. Each can be used to encode arbitrary strings of symbols
by means of "Gödelization", but the S-expression encoding is usually far
more convenient than the numerical encoding.

We  repeat  here  the  informal  characterization  of  Peano's  postulates 
and  the  analogous  postulates  for  S-expressions  from  [Levin]: 

The  Postulates  of  Arithmetic 

1.  Zero  is  a number. 

2. The successor of a number is a number.

3.  Zero  is  not  the  successor  of  any  number. 

4.  No  two  numbers  have  the  same  successor. 

5. (Induction Principle) Any property which is true for zero, and
is such that if it is true for some number it is also true for
the successor of that number, is true for all numbers.

Zero is notated as 0, and the successor of any number n is notated
as n'. As a convenience we define alternative notations for numbers other
than zero, such as decimal place-value notation. Thus for 0''''''''''''' we
often write 13.


The  Postulates  for  S-expressions 

1.  Atoms  are  S-expressions. 

2.  The  cons  of  any  two  S-expressions  is  an  S-expression. 

3. An atom is not the cons of any two S-expressions.

4. If α differs from β, or if γ differs from δ, then the cons of α
and γ differs from the cons of β and δ.

5.  (Induction  Principle)  Any  property  which  is  true  of  all  atoms, 
and  is  such  that  if  it  is  true  for  two  S-expressions  it  is  also 
true  for  their  cons,  is  true  for  all  S-expressions. 

Atoms are notated as strings of letters and digits. The cons of
two S-expressions α and β is notated (α . β). As a convenience, we define
alternative notations for some commonly used forms of S-expression, such as
list notation. The atom nil is called the "empty list"; we write it
as (). If (α β γ ... δ) is (the notation for) a list η (where the "..." is
meant as a meta-syntactic ellipsis), then the cons of ε and η is written
(ε α β γ ... δ). We also define quotation notation, in which (QUOTE α) is
written as 'α.

(This  definition  of  S-expressions  applies  to  "pure  LISP",  which  has 
no  side  effects.  In  Part  Two,  when  the  rplaca  and  rplacd  operators  are 
introduced,  the  phrase  "the  cons  of"  will  not  be  well-defined.) 

{This ain't A-lists}  Page 11

Our  symbol  table  routines  are  not  the  same  as  those  in  LISP  1.5. 
Their  behavior  is  approximately  the  same,  but  the  data  structures  involved 
differ.  The  LISP  1.5  routines  (pairlis  and  assoc)  use  the  traditional 
"association  list"  format: 


{Value Quibble}  Page 8

"Did  he  ever  return? 

No,  he  never  returned, 

And his fate is still unlearned..."

-- The Man Who Never Returned
(Charlie on the MTA)

We  said  "eval's  purpose  is  to  determine  the  values  of  expressions". 
But  what  is  the  value  of  the  expression  (driver)?  It  is  certainly  not  an 
illegal  or  useless  expression  to  evaluate,  yet  it  has  no  value.  The 
purpose  of  the  expression  is  to  cause  a certain  process  to  be  evolved;  it 
is  an  "infinite  loop",  which  never  returns.  This  process  includes  side 
effects  (read  and  print)  through  which  it  interacts  with  the  user.  This 
situation  arises  because  the  system  of  interest  is  broken  into  two  parts 
with  independent  state:  the  computer  and  the  user.  We  will  have  more  to 
say  about  this  later. 

it's hard to make it stop! These problems are related. The structure of
random-driver  is  an  infinite  loop,  as  with  all  drivers.  Because  random-driver 
never  returns  a value,  there  is  no  way  to  get  an  answer  out  without  a side- 
effect  like  print. 

We  can  arrange  to  signal  random-driver  that  no  more  values  are 
desired, and to return a value (see Figure N11).


(DEFINE (RANDOM-DRIVER F SEED)
  (COND ((CAR F) (CDR F))
        (T ((LAMBDA (NEWSEED)
              (RANDOM-DRIVER ((CDR F) NEWSEED) NEWSEED))
            ((LAMBDA (Z)
               (COND ((> Z 0) Z)
                     (T (+ Z -32766.))))
             (* SEED 899.))))))

(DEFINE (GAUSSIAN G)
  (WEBER 0 43 G))

(DEFINE (WEBER X N H)
  (COND ((= N 0) (H X))
        (T (CONS NIL
                 (LAMBDA (R)
                   (WEBER (+ X R) (- N 1) H))))))

(DEFINE (DRIVER USERFN)
  (RANDOM-DRIVER (GAUSSIAN USERFN) 43))

Figure N11

"Gaussian"  Random-Number  Generator  "Top  Level"  without  Side  Effects 


Using  this  new  definition,  we  can  write: 

(DEFINE  (P  R)  (CONS  T R)) 

(DRIVER  P 11) 

which  eventually  returns  one  "Gaussian"  number.  (Doing  something  with  more 
than  one  "Gaussian"  number  takes  a little  more  work...) 

Notice  that  in  order  to  make  this  work,  random-driver  had  to  know  an 
awful  lot  about  its  functional  argument;  a fairly  complicated  protocol  had 
to  be  developed  for  handshaking.  We  might  argue  that  this  exercise,  while 
it  has  indeed  removed  all  obvious  side  effects,  has  somewhat  tarnished  the 
modularity  of  the  random  program.  In  any  case,  the  structure  of  our  final 
program  is  not  exactly  what  we  had  in  mind  when  we  started. 
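
The handshaking protocol can be made explicit in a Python sketch. Two things here are assumptions rather than the figure's code: the pair (flag . payload) is modeled as a two-element tuple, and the seed update is a stand-in multiplicative generator modulo 32768, since Python integers do not overflow the way the figure's arithmetic presumes. The tail-recursive call is written as a loop.

```python
def random_driver(f, seed):
    """f is a (done, payload) pair. If done, payload is the final answer;
    otherwise payload is a function from the next random number to a new
    pair."""
    while True:
        done, payload = f
        if done:
            return payload
        # Stand-in generator (an assumption, not the figure's arithmetic).
        seed = (seed * 899) % 32768
        f = payload(seed)

def weber(x, n, h):
    """Ask for n more random numbers, summing them into x, then call h."""
    if n == 0:
        return h(x)
    return (False, lambda r: weber(x + r, n - 1, h))

def gaussian(g):
    return weber(0, 43, g)

def driver(userfn):
    return random_driver(gaussian(userfn), 43)

def p(r):
    """The user's function: accept one "Gaussian" number and stop."""
    return (True, r)

result = driver(p)
```

The driver has to know that a true flag means "stop and return the payload" and a false flag means "feed the payload another number", which is the coupling between random-driver and its functional argument that the text remarks on.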

{Y-operator}  Pages 26, 57

While  the  interpreter  of  Figure  8 cannot  define  recursive 
procedures,  it  is  possible  to  define  recursive  procedures  by  using  a 
variant  of  the  "paradoxical  combinator",  also  known  as  the  Y-operator: 

(DEFINE (Y F)
  ((LAMBDA (G)
     (LAMBDA (X)
       ((F (G G)) X)))
   (LAMBDA (G)
     (LAMBDA (X)
       ((F (G G)) X)))))

Using  this  we  define  the  doubly-recursive  algorithm  for  computing  the 
Fibonacci  function: 

(DEFINE (FIB K)
  ((Y (LAMBDA (F)
        (LAMBDA (N)
          (COND ((= N 0) 1)
                ((= N 1) 1)
                (T (+ (F (- N 1)) (F (- N 2))))))))
   K))

That  this  manages  to  work  is  truly  remarkable.  Notice  that  this  is  almost 
identical  to  the  label  construct  which  was  actually  introduced  by  LISP  1, 
though  at  the  time  it  was  invented  the  implementors  didn't  realize  this 
correspondence  [LISP  History]. 
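
The same variant of the Y-operator can be transcribed into Python, which is also an applicative-order language, so the inner (LAMBDA (X) ...) wrapper is needed there too: it delays the self-application (G G) until an argument actually arrives. The transcription below is a sketch; the names Y and fib simply mirror the definitions above.

```python
def Y(f):
    """Applicative-order Y: returns a fixed point of f without using
    any recursive definition."""
    return (lambda g: lambda x: f(g(g))(x))(lambda g: lambda x: f(g(g))(x))

# The doubly-recursive Fibonacci algorithm; f stands for "fib itself".
fib = Y(lambda f: lambda n: 1 if n in (0, 1) else f(n - 1) + f(n - 2))
```

With the base cases used above, fib(0) and fib(1) are both 1, so fib(5) is 8. No name is ever bound to a procedure that refers to itself.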